Field Of The Invention

The present invention relates to isolated bgl6 nucleic acid sequences which encode 
polypeptides having beta-glucosidase activity. The invention also relates to nucleic acid 
constructs, vectors, and host cells comprising the nucleic acid sequences as well as 
methods for producing recombinant BGL6 polypeptides.

Background Of The Invention

Cellulose and hemicellulose are the most abundant plant materials produced by
photosynthesis. They can be degraded and used as an energy source by numerous
microorganisms, including bacteria, yeast and fungi, that produce extracellular enzymes
capable of hydrolysis of the polymeric substrates to monomeric sugars (Aro et al., 2001). As
the limits of non-renewable resources approach, the potential of cellulose to become a
major renewable energy resource is enormous (Krishna et al., 2001). The effective
utilization of cellulose through biological processes is one approach to overcoming the
shortage of foods, feeds, and fuels (Ohmiya et al., 1997).

Cellulases are enzymes that hydrolyze cellulose (beta-1,4-glucan or beta
D-glucosidic linkages) resulting in the formation of glucose, cellobiose,
cellooligosaccharides, and the like. Cellulases have been traditionally divided into three
major classes: endoglucanases (EC 3.2.1.4) ("EG"), exoglucanases or cellobiohydrolases
(EC 3.2.1.91) ("CBH") and beta-glucosidases (beta-D-glucoside glucohydrolase; EC
3.2.1.21) ("BG") (Knowles et al., 1987; Schulein, 1988). Endoglucanases act mainly on the
amorphous parts of the cellulose fibre, whereas cellobiohydrolases are also able to degrade
crystalline cellulose (Nevalainen and Penttila, 1995). Thus, the presence of a
cellobiohydrolase in a cellulase system is required for efficient solubilization of crystalline
cellulose (Suurnakki et al., 2000). Beta-glucosidase acts to liberate D-glucose units from
cellobiose, cello-oligosaccharides, and other glucosides (Freer, 1993).

Cellulases are known to be produced by a large number of bacteria, yeast and fungi.
Certain fungi produce a complete cellulase system capable of degrading crystalline forms of 
cellulose, such that the cellulases are readily produced in large quantities via fermentation. 
Filamentous fungi play a special role since many yeast, such as Saccharomyces cerevisiae, 
lack the ability to hydrolyze cellulose. See, e.g., Aro et al., 2001; Aubert et al., 1988; Wood
et al., 1988, and Coughlan et al.

The fungal cellulase classifications of CBH, EG and BG can be further expanded to
include multiple components within each classification. For example, multiple CBHs, EGs 
and BGs have been isolated from a variety of fungal sources including Trichoderma reesei 
which contains known genes for 2 CBHs, i.e., CBH I and CBH II, at least 5 EGs, i.e., EG I, 
EG II, EG III, EG IV and EG V, and at least 2 BGs, i.e., BG1 and BG2.

In order to efficiently convert crystalline cellulose to glucose the complete cellulase 
system comprising components from each of the CBH, EG and BG classifications is 
required, with isolated components less effective in hydrolyzing crystalline cellulose (Filho et
al., 1996). A synergistic relationship has been observed between cellulase components
from different classifications. In particular, the EG-type cellulases and CBH-type cellulases
synergistically interact to more efficiently degrade cellulose. See, e.g., Wood, 1985. 

Cellulases are known in the art to be useful in the treatment of textiles for the 
purposes of enhancing the cleaning ability of detergent compositions, for use as a softening 
agent, for improving the feel and appearance of cotton fabrics, and the like (Kumar et al.,
1997). 

Cellulase-containing detergent compositions with improved cleaning performance 
(US Pat. No. 4,435,307; GB App. Nos. 2,095,275 and 2,094,826) and for use in the
treatment of fabric to improve the feel and appearance of the textile (US Pat. Nos.
5,648,263, 5,691,178, and 5,776,757; GB App. No. 1,358,599; The Shizuoka Prefectural
Hammamatsu Textile Industrial Research Institute Report, Vol. 24, pp. 54-61, 1986), have
been described. 

Hence, cellulases produced in fungi and bacteria have received significant attention. 
In particular, fermentation of Trichoderma spp. (e.g., Trichoderma longibrachiatum or
Trichoderma reesei) has been shown to produce a complete cellulase system capable of
degrading crystalline forms of cellulose. U.S. Pat. No. 5,475,101 discloses the purification
and molecular cloning of one particularly useful enzyme designated EGIII which is derived
from Trichoderma longibrachiatum.

Although cellulase compositions have been previously described, there remains a 
need for new and improved cellulase compositions for use in household detergents, 
stonewashing compositions or laundry detergents, etc. Cellulases that exhibit resistance to 
surfactants (e.g., linear alkyl sulfonates, LAS), improved performance under conditions of
thermal stress, increased or decreased cellulolytic capacity, and/or high level expression in
vitro, are of particular interest. 

Summary Of The Invention

The invention provides an isolated cellulase protein, identified herein as BGL6, and 
nucleic acids which encode BGL6. 

In one aspect, BGL6 polypeptides or proteins comprise a sequence having at least 
80%, 85%, 90%, 95%, 98% or more sequence identity to the sequence presented as SEQ
ID NO:2. 
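
By way of illustration only, a sequence identity threshold of this kind might be checked as in the following minimal Python sketch. It assumes the two sequences have already been aligned to equal length by some alignment program, with "-" marking gaps. The function names, the gap convention, and the choice to divide by the number of aligned columns are assumptions made for this example and are not prescribed by the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap character.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned five-residue sequences that differ at one position give 80% identity under this convention, so `meets_threshold` returns True at the 80% level and False at the 85% level.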

In a related aspect, the invention includes (i) fragments of BGL6, preferably at least 
about 20-100 amino acids in length, more preferably about 100-200 amino acids in length, 
and (ii) a pharmaceutical composition comprising BGL6. In various embodiments, the 
fragment corresponds to the N-terminal domain of BGL6 or the C-terminal domain of BGL6. 

In another aspect the invention includes an isolated polynucleotide having a 
sequence which encodes BGL6, a sequence complementary to the bgl6 coding sequence, 
and a composition comprising the polynucleotide. The polynucleotide may be mRNA, DNA, 
cDNA, genomic DNA, or an antisense analog thereof. 

A bgl6 polynucleotide may comprise an isolated nucleic acid molecule which 
hybridizes to the complement of the nucleic acid presented as SEQ ID NO: 1 under 
moderate to high stringency conditions, where the nucleic acid molecule encodes a BGL6 
polypeptide that exhibits beta-glucosidase activity. 

The polynucleotide may encode a BGL6 protein having at least 80%, 85%, 90%, 
95%, 98% or more sequence identity to the sequence presented as SEQ ID NO:1. In a
specific embodiment, the polynucleotide comprises a sequence substantially identical to 
SEQ ID NO:1. The invention also contemplates fragments of the polynucleotide, preferably 
at least about 15-30 nucleotides in length. 

The invention further provides recombinant expression vectors containing a nucleic 
acid sequence encoding BGL6 or a fragment or splice variant thereof, operably linked to
regulatory elements effective for expression of the protein in a selected host. In a related
aspect, the invention includes a host cell containing the vector.

The invention further includes a method for producing BGL6 by recombinant
techniques, by culturing recombinant prokaryotic or eukaryotic host cells comprising a nucleic
acid sequence encoding BGL6 under conditions effective to promote expression of the
protein, and subsequent recovery of the protein from the host cell or the cell culture medium.

In another aspect the invention provides for an enzymatic composition useful in the 
conversion of cellulose to ethanol. In a preferred embodiment the enzymatic composition 
comprises BGL6. The composition may further comprise additional cellulase enzymes such 
as endoglucanases and/or cellobiohydrolases. The composition may be enriched in BGL6. 

In yet another aspect, the invention includes an antibody specifically immunoreactive 
with BGL6. 

Analytical methods for detecting bgl6 nucleic acids and BGL6 proteins also form part 
of the invention. 

Brief Description Of The Figures

Figure 1 is a single stranded depiction of the nucleic acid sequence (SEQ ID NO:1), of 
the T. reesei bgl6, wherein the non-coding sequence is indicated as underlined.

Figure 2 shows the predicted amino acid sequence (SEQ ID NO:2) based on the 
nucleotide sequence provided in Figure 1, wherein the first start codon is utilized. 

Figure 3 shows the predicted amino acid sequence (SEQ ID NO:4) based on the 
nucleotide sequence provided in Figure 1, wherein the second start codon is utilized. 

Figure 4 is the coding sequence of bgl6, wherein the two alternate start codons are
underlined.

Detailed Description Of The Invention 
I. Definitions

Unless otherwise indicated, all technical and scientific terms used herein have the 
same meaning as they would to one skilled in the art of the present invention. Practitioners 
are particularly directed to Sambrook et al., 1989, and Ausubel FM et al., 1993, for
definitions and terms of the art. It is to be understood that this invention is not limited to the 
particular methodology, protocols, and reagents described, as these may vary. 

All publications cited herein are expressly incorporated herein by reference for the 
purpose of describing and disclosing compositions and methodologies which might be used 
in connection with the invention. 

The term "polypeptide" as used herein refers to a compound made up of a single chain 
of amino acid residues linked by peptide bonds. The term "protein" as used herein may be
synonymous with the term "polypeptide" or may refer, in addition, to a complex of two or more
polypeptides.

The term "nucleic acid molecule" includes RNA, DNA and cDNA molecules. It will be
understood that, as a result of the degeneracy of the genetic code, a multitude of nucleotide 
sequences encoding a given protein such as BGL6 may be produced. The present 
invention contemplates every possible variant nucleotide sequence, encoding BGL6, all of 
which are possible given the degeneracy of the genetic code. 
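
To give a sense of how large this multitude can be, the following minimal Python sketch counts the distinct nucleotide sequences that encode a given peptide under the standard genetic code. The helper name `count_encodings` and the compact table layout are assumptions introduced for this illustration. The count covers only the residues supplied and ignores any stop codon.

```python
from math import prod

# Standard genetic code, listed with bases in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Number of synonymous codons available for each residue ("*" marks stop).
CODONS_PER_RESIDUE = {}
for codon, residue in CODON_TABLE.items():
    CODONS_PER_RESIDUE[residue] = CODONS_PER_RESIDUE.get(residue, 0) + 1


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (one-letter codes)."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
```

Under this table, `count_encodings("MLS")` is 1 x 6 x 6 = 36, and the number grows multiplicatively with peptide length, which is why a full-length protein has an extremely large number of possible coding sequences.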

A "heterologous" nucleic acid construct or sequence has a portion of the sequence 
which is not native to the cell in which it is expressed. Heterologous, with respect to a 
control sequence refers to a control sequence (i.e., promoter or enhancer) that does not
function in nature to regulate the same gene the expression of which it is currently
regulating. Generally, heterologous nucleic acid sequences are not endogenous to the cell 
or part of the genome in which they are present, and have been added to the cell, by 
infection, transfection, transformation, microinjection, electroporation, or the like. A 
"heterologous" nucleic acid construct may contain a control sequence/DNA coding 
sequence combination that is the same as, or different from a control sequence/DNA coding 
sequence combination found in the native cell. 

As used herein, the term "vector" refers to a nucleic acid construct designed for 
transfer between different host cells. An "expression vector" refers to a vector that has the 
ability to incorporate and express heterologous DNA fragments in a foreign cell. Many 
prokaryotic and eukaryotic expression vectors are commercially available. Selection of 
appropriate expression vectors is within the knowledge of those having skill in the art. 

Accordingly, an "expression cassette" or "expression vector" is a nucleic acid 
construct generated recombinantly or synthetically, with a series of specified nucleic acid 
elements that permit transcription of a particular nucleic acid in a target cell. The 
recombinant expression cassette can be incorporated into a plasmid, chromosome, 
mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant
expression cassette portion of an expression vector includes, among other sequences, a 
nucleic acid sequence to be transcribed and a promoter. 

As used herein, the term "plasmid" refers to a circular double-stranded (ds) DNA
construct used as a cloning vector, and which forms an extrachromosomal self-replicating 
genetic element in many bacteria and some eukaryotes. 

As used herein, the term "selectable marker-encoding nucleotide sequence" refers to 
a nucleotide sequence which is capable of expression in cells and where expression of the 
selectable marker confers to cells containing the expressed gene the ability to grow in the 
presence of a corresponding selective agent, or under corresponding selective growth 
conditions.

As used herein, the term "promoter" refers to a nucleic acid sequence that functions
to direct transcription of a downstream gene. The promoter will generally be appropriate to 
the host cell in which the target gene is being expressed. The promoter together with other 
transcriptional and translational regulatory nucleic acid sequences (also termed "control 
sequences") are necessary to express a given gene. In general, the transcriptional and 
translational regulatory sequences include, but are not limited to, promoter sequences, 
ribosomal binding sites, transcriptional start and stop sequences, translational start and stop 
sequences, and enhancer or activator sequences. 

"Chimeric gene" or "heterologous nucleic acid construct", as defined herein refers to 
a non-native gene (i.e., one that has been introduced into a host) that may be composed of
parts of different genes, including regulatory elements. A chimeric gene construct for 
transformation of a host cell is typically composed of a transcriptional regulatory region 
(promoter) operably linked to a heterologous protein coding sequence, or, in a selectable 
marker chimeric gene, to a selectable marker gene encoding a protein conferring antibiotic 
resistance to transformed cells. A typical chimeric gene of the present invention, for 
transformation into a host cell, includes a transcriptional regulatory region that is constitutive 
or inducible, a protein coding sequence, and a terminator sequence. A chimeric gene 
construct may also include a second DNA sequence encoding a signal peptide if secretion 
of the target protein is desired. 

A nucleic acid is "operably linked" when it is placed into a functional relationship with 
another nucleic acid sequence. For example, DNA encoding a secretory leader is operably 
linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the 
secretion of the polypeptide; a promoter or enhancer is operably linked to a coding 
sequence if it affects the transcription of the sequence; or a ribosome binding site is 
operably linked to a coding sequence if it is positioned so as to facilitate translation. 
Generally, "operably linked" means that the DNA sequences being linked are contiguous, 
and, in the case of a secretory leader, contiguous and in reading frame. However, 
enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient 
restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors, linkers or 
primers for PCR are used in accordance with conventional practice.

As used herein, the term "gene" means the segment of DNA involved in producing a 
polypeptide chain, that may or may not include regions preceding and following the coding 
region, e.g. 5' untranslated (5' UTR) or "leader" sequences and 3' UTR or "trailer"
sequences, as well as intervening sequences (introns) between individual coding segments 
(exons). 

In general, nucleic acid molecules which encode BGL6 or an analog or homologue 
thereof will hybridize, under moderate to high stringency conditions, to the sequence
provided herein as SEQ ID NO:1. However, in some cases a BGL6-encoding nucleotide
sequence is employed that possesses a substantially different codon usage, while the 
protein encoded by the BGL6-encoding nucleotide sequence has the same or substantially 
the same amino acid sequence as the native protein. For example, the coding sequence 
may be modified to facilitate faster expression of BGL6 in a particular prokaryotic or 
eukaryotic expression system, in accordance with the frequency with which a particular 
codon is utilized by the host. Te'o et al. (2000), for example, describe the optimization of
genes for expression in filamentous fungi. 
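
The idea of adapting codon usage to a host can be sketched, purely for illustration, as a back-translation that selects the most frequently used codon for each residue. The usage table below is hypothetical: its numbers are invented for this example, cover only a few residues, and do not represent any real organism. Neither the table nor the function is taken from Te'o et al. or from the present specification.

```python
# Hypothetical relative codon frequencies for an imaginary host.
# These values are illustrative only and are not measured data.
HYPOTHETICAL_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAG": 0.80, "AAA": 0.20},
    "L": {"CTC": 0.35, "CTG": 0.30, "CTT": 0.15,
          "TTG": 0.10, "CTA": 0.05, "TTA": 0.05},
    "*": {"TAA": 0.50, "TAG": 0.25, "TGA": 0.25},
}


def back_translate(peptide: str, usage: dict) -> str:
    """Build a coding sequence using the highest-frequency codon per residue."""
    codons = []
    for residue in peptide.upper():
        if residue not in usage:
            raise KeyError(f"no codon usage data for residue {residue!r}")
        choices = usage[residue]
        # Pick the codon with the largest relative frequency in the host table.
        codons.append(max(choices, key=choices.get))
    return "".join(codons)
```

With the hypothetical table above, `back_translate("MKL*", HYPOTHETICAL_USAGE)` yields `ATGAAGCTCTAA`. The encoded amino acid sequence is unchanged; only the choice among synonymous codons differs.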

A nucleic acid sequence is considered to be "selectively hybridizable" to a reference 
nucleic acid sequence if the two sequences specifically hybridize to one another under 
moderate to high stringency hybridization and wash conditions. Hybridization conditions are 
based on the melting temperature (Tm) of the nucleic acid binding complex or probe. For 
example, "maximum stringency" typically occurs at about Tm-5°C (5°C below the Tm of the
probe); "high stringency" at about 5-10°C below the Tm; "intermediate stringency" at about
10-20°C below the Tm of the probe; and "low stringency" at about 20-25°C below the Tm.
Functionally, maximum stringency conditions may be used to identify sequences having 
strict identity or near-strict identity with the hybridization probe; while high stringency 
conditions are used to identify sequences having about 80% or more sequence identity with 
the probe. 
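
The stringency classes described above can be summarized, as a rough sketch, by a small Python function that maps the difference between the probe Tm and the hybridization temperature onto a class label. The function name, the handling of boundary values, and the treatment of temperatures less than 5°C below the Tm as "maximum" are assumptions made for this illustration; the ranges in the text are approximate ("about") and are not sharp cutoffs.

```python
def classify_stringency(tm_c: float, hybridization_temp_c: float) -> str:
    """Label hybridization stringency from degrees Celsius below the probe Tm."""
    delta = tm_c - hybridization_temp_c
    if delta < 0:
        raise ValueError("hybridization temperature is above the Tm")
    if delta <= 5:
        return "maximum"        # about Tm - 5 C
    if delta <= 10:
        return "high"           # about 5-10 C below Tm
    if delta <= 20:
        return "intermediate"   # about 10-20 C below Tm
    if delta <= 25:
        return "low"            # about 20-25 C below Tm
    return "below low stringency"
```

For a probe with a Tm of 70°C, hybridization at 62°C would be labeled "high" and hybridization at 55°C "intermediate" under these assumed boundaries.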

Moderate and high stringency hybridization conditions are well known in the art (see, 
for example, Sambrook et al., 1989, Chapters 9 and 11, and in Ausubel, F.M., et al., 1993,
expressly incorporated by reference herein). An example of high stringency conditions
includes hybridization at about 42°C in 50% formamide, 5X SSC, 5X Denhardt's solution,
0.5% SDS and 100 µg/ml denatured carrier DNA followed by washing two times in 2X SSC
and 0.5% SDS at room temperature and two additional times in 0.1X SSC and 0.5% SDS at
42°C. 

As used herein, "recombinant" includes reference to a cell or vector, that has been 
modified by the introduction of a heterologous nucleic acid sequence or that the cell is 
derived from a cell so modified. Thus, for example, recombinant cells express genes that 
are not found in identical form within the native (non-recombinant) form of the cell or express
native genes that are otherwise abnormally expressed, under expressed or not expressed at 
all as a result of deliberate human intervention. 

As used herein, the terms "transformed", "stably transformed" or "transgenic" with
reference to a cell means the cell has a non-native (heterologous) nucleic acid sequence
integrated into its genome or as an episomal plasmid that is maintained through multiple
generations.

As used herein, the term "expression" refers to the process by which a polypeptide is
produced based on the nucleic acid sequence of a gene. The process includes both 
transcription and translation. 

The term "introduced" in the context of inserting a nucleic acid sequence into a cell, 
means "transfection", or "transformation" or "transduction" and includes reference to the 
incorporation of a nucleic acid sequence into a eukaryotic or prokaryotic cell where the
nucleic acid sequence may be incorporated into the genome of the cell (for example, 
chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous 
replicon, or transiently expressed (for example, transfected mRNA). 

It follows that the term "BGL6 expression" refers to transcription and translation of 
the bgl6 gene, the products of which include precursor RNA, mRNA, polypeptide,
post-translationally processed polypeptides, and derivatives thereof, including BGL6 from related
species such as Trichoderma longibrachiatum (reesei), Trichoderma viride, Trichoderma 
koningii, Hypocrea jecorina and Hypocrea schweinitzii. By way of example, assays for 
BGL6 expression include Western blot for BGL6 protein, Northern blot analysis and reverse 
transcriptase polymerase chain reaction (RT-PCR) assays for BGL6 mRNA, and 
glucosidase activity assays as described in Chen et al. (1992) and Herr et al. (1978).

The term "alternative splicing" refers to the process whereby multiple polypeptide 
isoforms are generated from a single gene, and involves the splicing together of 
nonconsecutive exons during the processing of some, but not all, transcripts of the gene. 
Thus a particular exon may be connected to any one of several alternative exons to form 
messenger RNAs. The alternatively-spliced mRNAs produce polypeptides ("splice 
variants") in which some parts are common while other parts are different. 

The term "signal sequence" refers to a sequence of amino acids at the N-terminal 
portion of a protein which facilitates the secretion of the mature form of the protein outside 
the cell. The mature form of the extracellular protein lacks the signal sequence which is
cleaved off during the secretion process. 

By the term "host cell" is meant a cell that contains a vector and supports the 
replication, and/or transcription or transcription and translation (expression) of the 
expression construct. Host cells for use in the present invention can be prokaryotic cells, 
such as E. coli, or eukaryotic cells such as yeast, plant, insect, amphibian, or mammalian
cells. In general, host cells are filamentous fungi. 

The term "filamentous fungi" means any and all filamentous fungi recognized by 
those of skill in the art. A preferred fungus is selected from the group consisting of 
Aspergillus, Trichoderma, Fusarium, Chrysosporium, Penicillium, Humicola, Neurospora, or 
alternative sexual forms thereof such as Emericella, Hypocrea.

The term "cellooligosaccharide" refers to oligosaccharide groups containing from 2-8
glucose units and having beta-1,4 linkages, e.g., cellobiose.

The term "cellulase" refers to a category of enzymes capable of hydrolyzing cellulose
polymers to shorter cello-oligosaccharide oligomers, cellobiose and/or glucose. Numerous
examples of cellulases, such as exoglucanases, exocellobiohydrolases, endoglucanases,
and glucosidases have been obtained from cellulolytic organisms, particularly including 
fungi, plants and bacteria. 

The term "cellulose binding domain" as used herein refers to a portion of the amino
acid sequence of a cellulase or a region of the enzyme that is involved in the cellulose
binding activity of a cellulase or derivative thereof. Cellulose binding domains generally
function by non-covalently binding the cellulase to cellulose, a cellulose derivative or other
polysaccharide equivalent thereof. Cellulose binding domains permit or facilitate hydrolysis
of cellulose fibers by the structurally distinct catalytic core region, and typically function
independent of the catalytic core. Thus, a cellulose binding domain will not possess the
significant hydrolytic activity attributable to a catalytic core. In other words, a cellulose
binding domain is a structural element of the cellulase enzyme protein tertiary structure that
is distinct from the structural element which possesses catalytic activity. 

As used herein, the term "surfactant" refers to any compound generally recognized in 
the art as having surface active qualities. Thus, for example, surfactants comprise anionic, 
cationic and nonionic surfactants such as those commonly found in detergents. Anionic 
surfactants include linear or branched alkylbenzenesulfonates; alkyl or alkenyl ether sulfates
having linear or branched alkyl groups or alkenyl groups; alkyl or alkenyl sulfates;
olefinsulfonates; and alkanesulfonates. Ampholytic surfactants include quaternary 
ammonium salt sulfonates, and betaine-type ampholytic surfactants. Such ampholytic 
surfactants have both the positive and negative charged groups in the same molecule. 
Nonionic surfactants may comprise polyoxyalkylene ethers, as well as higher fatty acid 
alkanolamides or alkylene oxide adduct thereof, fatty acid glycerine monoesters, and the 
like. 

As used herein, the term "cellulose containing fabric" refers to any sewn or unsewn 
fabrics, yarns or fibers made of cotton or non-cotton containing cellulose or cotton or
non-cotton containing cellulose blends including natural cellulosics and manmade cellulosics
(such as jute, flax, ramie, rayon, and lyocell). 

As used herein, the term "cotton-containing fabric" refers to sewn or unsewn fabrics, 
yarns or fibers made of pure cotton or cotton blends including cotton woven fabrics, cotton 
knits, cotton denims, cotton yarns, raw cotton and the like.

As used herein, the term "stonewashing composition" refers to a formulation for use
in stonewashing cellulose containing fabrics. Stonewashing compositions are used to
modify cellulose containing fabrics prior to sale, i.e., during the manufacturing process. In
contrast, detergent compositions are intended for the cleaning of soiled garments and are 
not used during the manufacturing process. 

As used herein, the term "detergent composition" refers to a mixture which is
intended for use in a wash medium for the laundering of soiled cellulose containing fabrics.
In the context of the present invention, such compositions may include, in addition to
cellulases and surfactants, additional hydrolytic enzymes, builders, bleaching agents, bleach
activators, bluing agents and fluorescent dyes, caking inhibitors, masking agents, cellulase
activators, antioxidants, and solubilizers. 

As used herein, the term "decrease or elimination in expression of the bgl6 gene" 
means either that the bgl6 gene has been deleted from the genome and therefore
cannot be expressed by the recombinant host microorganism; or that the bgl6 gene has 
been modified such that a functional BGL6 enzyme is not produced by the recombinant host 
microorganism. 

The term "altered bgl6" or "altered bgl6 gene" means that the nucleic acid sequence
of the gene has been altered by removing, adding, and/or manipulating the coding sequence 
or the amino acid sequence of the expressed protein has been modified. 

As used herein, the term "purifying" generally refers to subjecting transgenic nucleic 
acid or protein containing cells to biochemical purification and/or column chromatography. 

As used herein, the terms "active" and "biologically active" refer to a biological 
activity associated with a particular protein, such as the enzymatic activity associated with a 
protease. It follows that the biological activity of a given protein refers to any biological 
activity typically attributed to that protein by those of skill in the art. 

As used herein, the term "enriched" means that the BGL6 is found in a concentration 
that is greater relative to the BGL6 concentration found in a wild-type, or naturally occurring,
fungal cellulase composition. 

A wild type fungal cellulase composition is one produced by a naturally occurring 
fungal source and which comprises one or more BG, CBH and EG components wherein 
each of these components is found at the ratio produced by the fungal source. Thus, an 
enriched BGL6 composition would have BGL6 at an altered ratio wherein the ratio of BGL6 
to other cellulase components (i.e., CBHs and endoglucanases) is elevated. This ratio may 
be increased by either increasing BGL6 or decreasing (or eliminating) at least one other 
component by any means known in the art. 

Thus, to illustrate, a naturally occurring cellulase system may be purified into
substantially pure components by recognized separation techniques well published in the
literature, including ion exchange chromatography at a suitable pH, affinity chromatography,
size exclusion and the like. For example, in ion exchange chromatography (usually anion 
exchange chromatography), it is possible to separate the cellulase components by eluting 
with a pH gradient, or a salt gradient, or both a pH and a salt gradient. The purified BGL6 
may then be added to the enzymatic solution resulting in an enriched BGL6 solution. 

Fungal cellulases may contain more than one BG component. The different components 
generally have different isoelectric points which allow for their separation via ion exchange 
chromatography and the like. Either a single BG component or a combination of BG 
components may be employed in an enzymatic solution. 

When employed in enzymatic solutions, the BG component is generally added in an 
amount sufficient to prevent inhibition by cellobiose of any CBH and endoglucanase 
components found in the cellulase composition. The amount of BG component added 
depends upon the amount of cellobiose produced during the biomass saccharification
process which can be readily determined by the skilled artisan. However, when employed,
the weight percent of the BGL6 component relative to any CBH or endoglucanase type 
components present in the cellulase composition is preferably from about 1, preferably 
about 5, preferably about 10, preferably about 15, or preferably about 20 weight percent to 
preferably about 25, preferably about 30, preferably about 35, preferably about 40,
preferably about 45 or preferably about 50 weight percent. Furthermore, preferred ranges 
may be about 0.5 to about 15 weight percent, about 0.5 to about 20 weight percent, from 
about 1 to about 10 weight percent, from about 1 to about 15 weight percent, from about 1 
to about 20 weight percent, from about 1 to about 25 weight percent, from about 5 to about 
20 weight percent, from about 5 to about 25 weight percent, from about 5 to about 30 weight 
percent, from about 5 to about 35 weight percent, from about 5 to about 40 weight percent, 
from about 5 to about 45 weight percent, from about 5 to about 50 weight percent, from 
about 10 to about 20 weight percent, from about 10 to about 25 weight percent, from about 
10 to about 30 weight percent, from about 10 to about 35 weight percent, from about 10 to 
about 40 weight percent, from about 10 to about 45 weight percent, from about 10 to about 
50 weight percent, from about 15 to about 20 weight percent, from about 15 to about 25 
weight percent, from about 15 to about 30 weight percent, from about 15 to about 35 weight 
percent, from about 15 to about 40 weight percent, from about 15 to about 45 weight
percent, from about 15 to about 50 weight percent. 
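
As a simple worked illustration of the weight percent figures above, the following Python sketch computes the weight percent of BGL6 relative to the combined CBH and endoglucanase components, and the mass of BGL6 needed to reach a target percentage. It assumes that "relative to" means BGL6 mass divided by the combined CBH and endoglucanase mass, rather than by the total composition mass. That reading, and the function names, are assumptions made for this example.

```python
def bgl6_weight_percent(bgl6_mass: float, cbh_eg_mass: float) -> float:
    """Weight percent of BGL6 relative to the combined CBH and EG mass."""
    if cbh_eg_mass <= 0:
        raise ValueError("combined CBH and EG mass must be positive")
    return 100.0 * bgl6_mass / cbh_eg_mass


def bgl6_mass_for_target(target_percent: float, cbh_eg_mass: float) -> float:
    """Mass of BGL6 giving the target weight percent (same units as input)."""
    return target_percent * cbh_eg_mass / 100.0


def within_range(percent: float, low: float = 1.0, high: float = 50.0) -> bool:
    """True if the weight percent lies inside the chosen range, inclusive."""
    return low <= percent <= high
```

For instance, 5 mg of BGL6 with 100 mg of combined CBH and endoglucanase components corresponds to 5 weight percent under this reading, which falls inside the broadest range of about 1 to about 50 weight percent.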

II. Target Organisms 

A. Filamentous fungi

Filamentous fungi include all filamentous forms of the subdivision Eumycota and
Oomycota. The filamentous fungi are characterized by vegetative mycelium having a cell
wall composed of chitin, glucan, chitosan, mannan, and other complex polysaccharides,
with vegetative growth by hyphal elongation and carbon catabolism that is obligately 
aerobic. 

In the present invention, the filamentous fungal parent cell may be a cell of a species 
of, but not limited to, Trichoderma, e.g., Trichoderma longibrachiatum (reesei), Trichoderma
viride, Trichoderma koningii, Trichoderma harzianum; Penicillium sp.; Humicola sp.,
including Humicola insolens; Chrysosporium sp., including C. lucknowense; Gliocladium sp.;
Aspergillus sp.; Fusarium sp., Neurospora sp., Hypocrea sp., and Emericella sp. As used
herein, the term "Trichoderma" or "Trichoderma sp." refers to any fungal strains which have
previously been classified as Trichoderma or are currently classified as Trichoderma. 

In one preferred embodiment, the filamentous fungal parent cell is an Aspergillus 
niger, Aspergillus awamori, Aspergillus aculeatus, or Aspergillus nidulans cell.

In another preferred embodiment, the filamentous fungal parent cell is a Trichoderma 
reesei cell. 

III. Cellulases 

Cellulases are known in the art as enzymes that hydrolyze cellulose (beta-1,4-glucan 
or beta D-glucosidic linkages) resulting in the formation of glucose, cellobiose, 
cellooligosaccharides, and the like. As set forth above, cellulases have been traditionally 
divided into three major classes: endoglucanases (EC 3.2.1.4) ("EG"), exoglucanases or 
cellobiohydrolases (EC 3.2.1.91) ("CBH") and beta-glucosidases (EC 3.2.1.21) ("BG") 
(Knowles et al., 1987; Schulein, 1988). 

Certain fungi produce complete cellulase systems which include exo- 
cellobiohydrolases or CBH-type cellulases, endoglucanases or EG-type cellulases and beta- 
glucosidases or BG-type cellulases (Schulein, 1988). However, sometimes these systems 
lack CBH-type cellulases and bacterial cellulases also typically include little or no CBH-type 
cellulases. In addition, it has been shown that the EG components and CBH components 
synergistically interact to more efficiently degrade cellulose. See, e.g., Wood, 1985. The 
different components, i.e., the various endoglucanases and exocellobiohydrolases in a 
multi-component or complete cellulase system, generally have different properties, such as 
isoelectric point, molecular weight, degree of glycosylation, substrate specificity and 
enzymatic action patterns. 

It is believed that endoglucanase-type cellulases hydrolyze internal 
beta-1,4-glucosidic bonds in regions of low crystallinity of the cellulose and 
exo-cellobiohydrolase-type cellulases hydrolyze cellobiose from the reducing or 
non-reducing end of cellulose. It follows that the action of endoglucanase components 
can greatly facilitate the action of exo-cellobiohydrolases by creating new chain ends 
which are recognized by exo-cellobiohydrolase components. Further, 
beta-glucosidase-type cellulases have been shown to catalyze the hydrolysis of alkyl 
and/or aryl β-D-glucosides such as methyl β-D-glucoside 
and p-nitrophenyl glucoside as well as glycosides containing only carbohydrate residues, 
such as cellobiose. This yields glucose as the sole product for the microorganism and 
reduces or eliminates cellobiose which inhibits cellobiohydrolases and endoglucanases. 

Accordingly, β-glucosidase-type cellulases are considered to be an integral part of 
the cellulase system because they drive the overall reaction to glucose. Increased 
expression of BG in T. reesei has been shown to improve degradation of cellulose to 
glucose. See EP0562003, which is hereby incorporated by reference. In addition, 
β-glucosidases can catalyze the hydrolysis of a number of different substrates, and 
therefore they find utility in a variety of different applications. Some β-glucosidases can 
be added to grapes during wine making to enhance the potential aroma of the finished wine 
product. Yet another application can be to use β-glucosidase in fruit to enhance the aroma 
thereof. Alternatively, β-glucosidase can be used directly in food additives or wine 
processing to enhance the flavor and aroma. 

Cellulases also find a number of uses in detergent compositions including to 
enhance cleaning ability, as a softening agent and to improve the feel of cotton fabrics 
(Hemmpel, 1991; Tyndall, 1992; Kumar et al., 1997). While the mechanism is not part of the 
invention, softening and color restoration properties of cellulase have been attributed to the 
alkaline endoglucanase components in cellulase compositions, as exemplified by U.S. 
Patent Nos. 5,648,263, 5,691,178, and 5,776,757, which disclose that detergent 
compositions containing a cellulase composition enriched in a specified alkaline 
endoglucanase component impart color restoration and improved softening to treated 
garments as compared to cellulase compositions not enriched in such a component. In 
addition, the use of such alkaline endoglucanase components in detergent compositions has 
been shown to complement the pH requirements of the detergent composition (e.g., by 
exhibiting maximal activity at an alkaline pH of 7.5 to 10, as described in U.S. Patent Nos. 
5,648,263, 5,691,178, and 5,776,757). 

Cellulase compositions have also been shown to degrade cotton-containing fabrics, 
resulting in strength loss in the fabric (U.S. Patent No. 4,822,516), contributing to 
reluctance to use cellulase compositions in commercial detergent applications. Cellulase 
compositions comprising endoglucanase components have been suggested to exhibit 
reduced strength loss for cotton-containing fabrics as compared to compositions comprising 
a complete cellulase system. 

Cellulases have also been shown to be useful in degradation of cellulose biomass to 
ethanol (wherein the cellulase degrades cellulose to glucose and yeast or other microbes 
further ferment the glucose into ethanol), in the treatment of mechanical pulp (Pare et al., 
1996), for use as a feed additive (WO 91/04673) and in grain wet milling. 

Numerous cellulases have been described in the scientific literature, examples of 
which include: from Trichoderma reesei: Shoemaker, S. et al., Bio/Technology, 1:691-696, 
1983, which discloses CBHI; Teeri, T. et al., Gene, 51:43-52, 1987, which discloses CBHII; 
Penttila, M. et al., Gene, 45:253-263, 1986, which discloses EGI; Saloheimo, M. et al., 
Gene, 63:11-22, 1988, which discloses EGII; Okada, M. et al., Appl. Environ. Microbiol., 
64:555-563, 1988, which discloses EGIII; Saloheimo, M. et al., Eur. J. Biochem., 
249:584-591, 1997, which discloses EGIV; Saloheimo, A. et al., Molecular Microbiology, 
13:219-228, 1994, which discloses EGV; Barnett, C. C., et al., Bio/Technology, 9:562-567, 
1991, which discloses BGL1; and Takashima, S. et al., J. Biochem., 125:728-736, 1999, which 
discloses BGL2. Cellulases from species other than Trichoderma have also been described, 
e.g., Ooi et al., 1990, which discloses the cDNA sequence coding for endoglucanase F1-CMC 
produced by Aspergillus aculeatus; Kawaguchi T et al., 1996, which discloses the cloning 
and sequencing of the cDNA encoding beta-glucosidase 1 from Aspergillus aculeatus; 
Sakamoto et al., 1995, which discloses the cDNA sequence encoding the endoglucanase 
CMCase-1 from Aspergillus kawachii IFO 4308; Saarilahti et al., 1990, which discloses an 
endoglucanase from Erwinia carotovora; Spilliaert R et al., 1994, which discloses the 
cloning and sequencing of bglA, coding for a thermostable beta-glucanase from 
Rhodothermus marinus; and Halldorsdottir S et al., 1998, which discloses the cloning, 
sequencing and overexpression of a Rhodothermus marinus gene encoding a thermostable 
cellulase of glycosyl hydrolase family 12. However, there remains a need for identification 
and characterization of novel cellulases with improved properties, such as improved 
performance under conditions of thermal stress or in the presence of surfactants, increased 
specific activity, altered substrate cleavage pattern, and/or high level expression in vitro. 

The development of new and improved cellulase compositions that comprise varying 
amounts of CBH-type, EG-type and BG-type cellulases is of interest for use: (1) in detergent 
compositions that exhibit enhanced cleaning ability, function as a softening agent and/or 
improve the feel of cotton fabrics (e.g., "stone washing" or "biopolishing"); (2) in 
compositions for degrading wood pulp or other biomass into sugars (e.g., for bio-ethanol 
production); and/or (3) in feed compositions. 

IV. Methods of Identifying Novel Sequences 

Open reading frames (ORFs) are analyzed following full or partial sequencing of the 
T. reesei genome or of clones of cDNA libraries derived from T. reesei mRNA and are 
further analyzed using sequence analysis software, and by determining homology to known 
sequences in databases (public/private). 

V. bgl6 Nucleic Acids And BGL6 Polypeptides. 

A. bgl6 Nucleic Acids 

The nucleic acid molecules of the present invention include the native coding 
sequence, the cDNA sequence for bgl6 presented herein as SEQ. ID. NO:1, and 
homologues thereof in other species, naturally occurring allelic and splice variants, nucleic 
acid fragments, and biologically active (functional) derivatives thereof, such as, amino acid 
sequence variants of the native molecule and sequences which encode fusion proteins. 
The bgl6 gene has two putative start codons. The two start codons are underlined in Figure 
4. The sequences are collectively referred to herein as "BGL6-encoding nucleic acid 
sequences". 

A Basic BLASTN search (http://www.ncbi.nlm.nih.gov/BLAST) of the non-redundant 
nucleic acid sequence database, conducted on October 1, 2002, with the bgl6 gene 
sequence presented in Figure 1 (SEQ ID NO:1), indicated that there were no sequences 
producing significant alignments (i.e., with an E value of 10^-5 or less). 

A bgl6 nucleic acid sequence of this invention may be a DNA or RNA sequence, 
derived from genomic DNA, cDNA, mRNA, or may be synthesized in whole or in part. The 
DNA may be double-stranded or single-stranded and if single-stranded may be the coding 
strand or the non-coding (antisense, complementary) strand. The nucleic acid sequence 
may be cloned, for example, by isolating genomic DNA from an appropriate source, and 
amplifying and cloning the sequence of interest using a polymerase chain reaction (PCR). 
Alternatively, the nucleic acid sequence may be synthesized, either completely or in part, 
especially where it is desirable to provide host-preferred sequences for optimal expression. 
Thus, all or a portion of the desired structural gene (that portion of the gene which encodes 
a polypeptide or protein) may be synthesized using codons preferred by a selected host. 

Due to the inherent degeneracy of the genetic code, nucleic acid sequences other 
than the native form which encode substantially the same or a functionally equivalent amino 
acid sequence may be used to clone and/or express BGL6-encoding nucleic acid 
sequences. Thus, for a given BGL6-encoding nucleic acid sequence, it is appreciated that as 
a result of the degeneracy of the genetic code, a number of coding sequences can be 
produced that encode a protein having the same amino acid sequence. For example, the 
triplet CGT encodes the amino acid arginine. Arginine is alternatively encoded by CGA, CGC, 
CGG, AGA, and AGG. Therefore it is appreciated that such substitutions in the coding region 
fall within the nucleic acid sequence variants covered by the present invention. Any and all of 
these sequence variants can be utilized in the same way as described herein for the native 
form of a BGL6-encoding nucleic acid sequence. 

A "variant" BGL6-encoding nucleic acid sequence may encode a "variant" BGL6 
amino acid sequence which is altered by one or more amino acids from the native 
polypeptide sequence or may be truncated by removal of one or more amino acids from 
either end of the polypeptide sequence, both of which are included within the scope of the 
invention. Similarly, the term "modified form of", relative to BGL6, means a derivative or 
variant form of the native BGL6 protein-encoding nucleic acid sequence or the native BGL6 
amino acid sequence. 

Similarly, the polynucleotides for use in practicing the invention include sequences 
which encode native BGL6 proteins and splice variants thereof, sequences complementary 
to the native protein coding sequence, and novel fragments of BGL6 encoding 
polynucleotides. A BGL6 encoding nucleic acid sequence may contain one or more intron 
sequences if it is a genomic DNA sequence. 

In one general embodiment, a BGL6-encoding nucleotide sequence has at least 
70%, preferably 80%, 85%, 90%, 95%, 98%, or more sequence identity to the bgl6 coding 
sequence presented herein as SEQ ID NO:1. 
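
By way of illustration only, the following minimal Python sketch computes percent identity over two sequences that have already been aligned and reports which of the thresholds recited above are met. The function names, the gap symbol, the column-counting convention, and the toy sequences are assumptions made for this example; the sequences are not taken from SEQ ID NO:1, and the sequence alignment program and identity definition detailed herein govern.

```python
# Thresholds recited in the text for a BGL6-encoding nucleotide sequence.
THRESHOLDS = (70, 80, 85, 90, 95, 98)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: identity = identical non-gap columns / total alignment columns.
    Other conventions (e.g., dividing by the shorter sequence) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity):
    """Return the recited thresholds that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Hypothetical toy alignment (not SEQ ID NO:1): one mismatch in ten columns.
reference = "ATGCATGCAT"
candidate = "ATGCATGGAT"
identity = percent_identity(reference, candidate)
print(identity, thresholds_met(identity))
```

In practice the aligned strings would come from an alignment program; the sketch only shows how a single identity figure maps onto the tiered thresholds.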

In another embodiment, a BGL6-encoding nucleotide sequence will hybridize under 
moderate to high stringency conditions to a nucleotide sequence that encodes a BGL6 
protein. In a related embodiment, a BGL6-encoding nucleotide sequence will hybridize 
under moderate to high stringency conditions to the nucleotide sequence presented as SEQ 
ID NO:1. 

It is appreciated that some nucleic acid sequence variants that encode BGL6 may or 
may not selectively hybridize to the parent sequence. By way of example, in situations where 
the coding sequence has been optimized based on the degeneracy of the genetic code, a 
variant coding sequence may be produced that encodes a BGL6 protein, but does not 
hybridize to a native BGL6-encoding nucleic acid sequence under moderate to high 
stringency conditions. This would occur, for example, when the sequence variant includes a 
different codon for each of the amino acids encoded by the parent nucleotide. 
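
The effect of codon degeneracy described above can be sketched in a few lines of Python. The example below builds the standard genetic code, lists the codons synonymous with CGT, and recodes a short coding fragment so that every codon with a synonym is replaced, leaving the encoded amino acids unchanged. The helper names and the nine-nucleotide fragment are hypothetical and are not taken from SEQ ID NO:1; choosing the alphabetically first synonym is an arbitrary assumption, not a host codon preference.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(codon):
    """Other codons encoding the same amino acid, in alphabetical order."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


def recode(dna):
    """Replace every codon with a synonymous alternative where one exists."""
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        alternatives = synonymous_codons(codon)
        recoded.append(alternatives[0] if alternatives else codon)
    return "".join(recoded)


# Hypothetical 9-nucleotide fragment (Arg-Ala-Asp); not taken from SEQ ID NO:1.
parent = "CGTGCTGAT"
variant = recode(parent)
changed = sum(1 for x, y in zip(parent, variant) if x != y)

print(synonymous_codons("CGT"))  # the five other arginine codons
print(parent, "->", variant, "changed positions:", changed)
print(translate(parent) == translate(variant))
```

Even in this short fragment, nearly half of the nucleotide positions change while the translation is identical, which is why a fully recoded variant may fail to hybridize to the parent sequence under stringent conditions.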

As will be further understood by those of skill in the art, in some cases it may be 
advantageous to produce nucleotide sequences possessing non-naturally occurring codons, 
e.g., inosine or another non-naturally occurring nucleotide analog. Codons preferred by a 
particular eukaryotic host can be selected, for example, to increase the rate of BGL6 protein 
expression or to produce recombinant RNA transcripts having desirable properties, such as 
a longer half-life, than transcripts produced from the naturally occurring sequence. Hence, a 
native BGL6-encoding nucleotide sequence may be engineered in order to alter the coding 
sequence for a variety of reasons, including but not limited to, alterations which modify the 
cloning, processing and/or expression of the BGL6 protein by a cell. 

Particularly preferred are nucleic acid substitutions, additions, and deletions that are 
silent such that they do not alter the properties or activities of the native polynucleotide or 
polypeptide. 

The variations can be made using methods known in the art such as 
oligonucleotide-mediated (site-directed) mutagenesis, and PCR mutagenesis. Site-directed 
mutagenesis (Carter et al., 1986; Zoller et al., 1987), cassette mutagenesis (Wells et al., 1985), 
restriction selection mutagenesis (Wells et al., 1986) or other known techniques can be performed on the 
cloned DNA to produce the BGL6 polypeptide-encoding variant DNA. 

However, in some cases it may be advantageous to express variants of bgl6 which 
lack the properties or activities of the native bgl6 polynucleotide or BGL6 polypeptide. In 
such cases, mutant or modified forms of the native BGL6-encoding nucleic acid sequence 
may be generated using techniques routinely employed by those of skill in the art. 

B. BGL6 Polypeptides 

In one preferred embodiment, the invention provides a BGL6 polypeptide, having a 
native mature or full-length BGL6 polypeptide sequence comprising the sequence presented 
in Figure 2 (SEQ ID NO:2). A BGL6 polypeptide of the invention can be the mature BGL6 
polypeptide, part of a fusion protein or a fragment or variant of the BGL6 polypeptide 
sequence presented in Figure 2 (SEQ ID NO:2). 

Ordinarily, a BGL6 polypeptide of the invention has at least 80% identity to a BGL6 
amino acid sequence over its entire length. More preferable are BGL6 polypeptide 
sequences that comprise a region having at least 80, 85, 90, 95, 98% or more sequence 
identity to the BGL6 polypeptide sequence of Figure 2 (SEQ ID NO:2), using a sequence 
alignment program, as detailed herein. 

Typically, a "modified form of" a native BGL6 protein or a "variant" BGL6 protein has 
a derivative sequence containing at least one amino acid substitution, addition, deletion or 
insertion, respectively. 

It is well-known in the art that certain amino acid substitutions may be made in 
protein sequences without affecting the function of the protein. Generally, conservative 
amino acid substitutions or substitutions of similar amino acids are tolerated without 
affecting protein function. Similar amino acids can be those that are similar in size and/or 
charge properties, for example, aspartate and glutamate, and isoleucine and valine, are 
both pairs of similar amino acids. Similarity between amino acid pairs has been assessed in 
the art in a number of ways. For example, Dayhoff et al. (1978), which is incorporated by 
reference herein, provides frequency tables for amino acid substitutions which can be 
employed as a measure of amino acid similarity. Dayhoff et al.'s frequency tables are based 
on comparisons of amino acid sequences for proteins having the same function from a 
variety of evolutionarily different sources. 

Fragments and variants of the BGL6 polypeptide sequence of Figure 2 (SEQ ID 
NO:2), are considered to be a part of the invention. A fragment is a variant polypeptide 
which has an amino acid sequence that is entirely the same as part but not all of the amino 
acid sequence of the previously described polypeptides. The fragments can be "free- 
standing" or comprised within a larger polypeptide of which the fragment forms a part or a 
region, most preferably as a single continuous region. Preferred fragments are biologically 
active fragments which are those fragments that mediate activities of the polypeptides of the 
invention, including those with similar activity or improved activity or with a decreased 
activity. Also included are those fragments that are antigenic or immunogenic in an animal, 
particularly a human. In this aspect, the invention includes (i) fragments of BGL6, preferably 
at least about 20-100 amino acids in length, more preferably about 100-200 amino acids in 
length, and (ii) a pharmaceutical composition comprising BGL6. In various embodiments, 
the fragment corresponds to the N-terminal domain of BGL6 or the C-terminal domain of 
BGL6. 

BGL6 polypeptides of the invention also include polypeptides that vary from the 
BGL6 polypeptide sequence of Figure 2 (SEQ ID NO:2). These variants may be 
substitutional, insertional or deletional variants. The variants typically exhibit the same 
qualitative biological activity as the naturally occurring analogue, although variants can also be 
selected which have modified characteristics as further described below. 

A "substitution" results from the replacement of one or more nucleotides or amino 
acids by different nucleotides or amino acids, respectively. 

An "insertion" or "addition" is that change in a nucleotide or amino acid sequence 
which has resulted in the addition of one or more nucleotides or amino acid residues, 
respectively, as compared to the naturally occurring sequence. 

A "deletion" is defined as a change in either nucleotide or amino acid sequence in 
which one or more nucleotides or amino acid residues, respectively, are absent. 

Amino acid substitutions are typically of single residues; insertions usually will be on 
the order of from about 1 to 20 amino acids, although considerably larger insertions may be 
tolerated. Deletions range from about 1 to about 20 residues, although in some cases 
deletions may be much larger. 

Substitutions, deletions, insertions or any combination thereof may be used to arrive at 
a final derivative. Generally these changes are done on a few amino acids to minimize the 
alteration of the molecule. However, larger changes may be tolerated in certain 
circumstances. 



Amino acid substitutions can be the result of replacing one amino acid with another 
amino acid having similar structural and/or chemical properties, such as the replacement of an 
isoleucine with a valine, i.e., conservative amino acid replacements. Insertions or deletions 
may optionally be in the range of 1 to 5 amino acids. 

Substitutions are generally made in accordance with known "conservative 
substitutions". A "conservative substitution" refers to the substitution of an amino acid in one 
class by an amino acid in the same class, where a class is defined by common 
physicochemical amino acid side chain properties and high substitution frequencies in 
homologous proteins found in nature (as determined, e.g., by a standard Dayhoff frequency 
exchange matrix or BLOSUM matrix). (See generally, Doolittle, R.F., 1986.) 

A "non-conservative substitution" refers to the substitution of an amino acid in one 
class with an amino acid from another class. 
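
The class-based definitions above can be illustrated with a short Python sketch that labels a substitution as conservative or non-conservative. The five side-chain classes used here are one common textbook grouping and are an assumption made for this example; the text itself refers to Dayhoff or BLOSUM matrices, which give graded rather than yes/no answers. The function names are hypothetical.

```python
# One common textbook grouping of the 20 standard amino acids by side chain.
# Assumption: these class boundaries are illustrative only and are not
# prescribed by the text.
SIDE_CHAIN_CLASSES = {
    "nonpolar_aliphatic": "GAPVLIM",
    "aromatic": "FYW",
    "polar_uncharged": "STCNQ",
    "acidic": "DE",
    "basic": "KRH",
}

# Reverse lookup: one-letter amino acid code -> class name.
CLASS_OF = {
    aa: name for name, members in SIDE_CHAIN_CLASSES.items() for aa in members
}


def is_conservative(original, replacement):
    """True when both residues fall in the same (assumed) side-chain class."""
    return CLASS_OF[original] == CLASS_OF[replacement]


def classify_substitution(original, replacement):
    """Label a single-residue substitution using the assumed classes."""
    if original == replacement:
        return "no change"
    if is_conservative(original, replacement):
        return "conservative"
    return "non-conservative"


# The pairs named in the text as similar amino acids, plus a contrasting pair.
print(classify_substitution("I", "V"))  # isoleucine -> valine
print(classify_substitution("D", "E"))  # aspartate -> glutamate
print(classify_substitution("D", "K"))  # aspartate -> lysine
```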

BGL6 polypeptide variants typically exhibit the same qualitative biological activity as 
the naturally occurring analogue, although variants also are selected to modify the 
characteristics of the BGL6 polypeptide, as needed. For example, glycosylation sites, and 
more particularly one or more O-linked or N-linked glycosylation sites may be altered or 
removed. Those skilled in the art will appreciate that amino acid changes may alter post- 
translational processes of the BGL6 polypeptide, such as changing the number or position of 
glycosylation sites or altering the membrane anchoring characteristics or secretion 
characteristics or other cellular localization characteristics. 

Also included within the definition of BGL6 polypeptides are other related BGL6 
polypeptides. Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences 
may be used to find other related polypeptides. Useful probe or primer sequences may be 
designed to all or part of the BGL6 polypeptide sequence, or to sequences outside the coding 
region. As is generally known in the art, preferred PCR primers are from about 15 to about 35 
nucleotides in length, with from about 20 to about 30 being preferred, and may contain inosine 
as needed. The conditions for the PCR reaction are generally known in the art. 
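
As a rough illustration of degenerate primer design against a stretch of polypeptide sequence, the Python sketch below estimates how many distinct nucleotide sequences could encode a short peptide and checks the resulting primer length against the ranges stated above. The peptide is hypothetical and is not taken from SEQ ID NO:2; the function names are assumptions, and practical primer design would weigh further factors (melting temperature, inosine placement, 3' end stability) that are not modeled here.

```python
import math

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def primer_degeneracy(peptide):
    """Number of distinct coding sequences for the peptide (product of codon counts)."""
    return math.prod(CODON_COUNT[aa] for aa in peptide)


def primer_length(peptide):
    """Length in nucleotides of a primer covering every codon of the peptide."""
    return 3 * len(peptide)


def length_status(nucleotides):
    """Compare a primer length with the ranges stated in the text."""
    if 20 <= nucleotides <= 30:
        return "preferred"
    if 15 <= nucleotides <= 35:
        return "acceptable"
    return "outside range"


# Hypothetical 7-residue peptide chosen for low degeneracy (not from SEQ ID NO:2).
peptide = "MWHQKDE"
print(primer_length(peptide), length_status(primer_length(peptide)))
print(primer_degeneracy(peptide))
```

Peptide stretches rich in methionine and tryptophan keep the degeneracy low, whereas leucine, serine, and arginine each multiply it by six.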

Covalent modifications of BGL6 polypeptides are also included within the scope of this 
invention. For example, the invention provides BGL6 polypeptides that are a mature protein 
and may comprise additional amino or carboxyl-terminal amino acids, or amino acids within 
the mature polypeptide (for example, when the mature form of the protein has more than 
one polypeptide chain). Such sequences can, for example, play a role in the processing of 
the protein from a precursor to a mature form, allow protein transport, shorten or lengthen 
protein half-life, or facilitate manipulation of the protein in assays or production. As an 
example, it is believed that the instant novel BGL6 polypeptide is an intracellular protein. 
Thus, in order to be exported to the extracellular milieu a secretion signal that is 
subsequently removed may be desirable. 

Also contemplated are modifications directed to alteration of an active site, alteration 
of the pH optima, temperature optima, and/or substrate affinity of the BGL6 enzyme. 

Figure 2 shows the predicted amino acid sequence (SEQ ID NO:2) of an exemplary 
BGL6 polypeptide based on the nucleotide sequence provided in Figure 1. The predicted 
molecular weight of the encoded BGL6 polypeptide is 92 kDa. No sequence resembling a 
signal peptide (Nielsen, H., Engelbrecht, J., Brunak, S., von Heijne, G., Protein Engineering, 
10:1-6, 1997) is present at the amino terminus of BGL6 suggesting that the BGL6 
polypeptide is not secreted. 
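
The text does not state how the 92 kDa figure was computed. One common approach, sketched below in Python as an assumption rather than as the method actually used, sums average residue masses over the predicted amino acid sequence and adds one water molecule for the free termini. The function name and the toy peptide are hypothetical; the sequence of Figure 2 is not reproduced here.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def molecular_weight_kda(protein):
    """Estimate the molecular weight of an unmodified polypeptide in kDa.

    Assumption: no glycosylation or other post-translational modification.
    """
    if not protein:
        raise ValueError("protein sequence is empty")
    daltons = WATER_MASS + sum(RESIDUE_MASS[aa] for aa in protein)
    return daltons / 1000.0


# Hypothetical toy peptide; substitute the full predicted sequence to obtain
# an estimate comparable to the figure quoted in the text.
print(round(molecular_weight_kda("MKTAYIAKQR"), 3))
```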

A Basic BLASTP search (http://www.ncbi.nlm.nih.gov/BLAST) of the non-redundant 
protein database, conducted on October 1, 2002 with the BGL6 amino acid sequence, 
indicated 42% sequence identity to GenBank Accession Number P07337 (beta-glucosidase 
precursor of Kluyveromyces marxianus var. marxianus), 43% sequence identity to GenBank 
Accession Number AL355920 (beta-glucosidase precursor of Schizosaccharomyces 
pombe), 38% sequence identity to GenBank Accession Number AF329731 
(beta-glucosidase of Volvariella volvacea), and 38% sequence identity to GenBank Accession 
Number AJ293760 (putative beta-glucosidase of Agaricus bisporus). The ten sequences 
having highest identity but less than 43% identity with BGL6 were all annotated as beta- 
glucosidases. These sequence similarities indicate that BGL6 is a member of glycosyl 
hydrolase family 3 (Henrissat, B. and Bairoch, A. (1993) Biochem. J. 293:781-788). 

C. Anti-BGL6 Antibodies. 

The present invention further provides anti-BGL6 antibodies. The antibodies may be 
polyclonal, monoclonal, humanized, bispecific or heteroconjugate antibodies. 

Methods of preparing polyclonal antibodies are known to the skilled artisan. The 
immunizing agent may be a BGL6 polypeptide or a fusion protein thereof. It may be useful to 
conjugate the antigen to a protein known to be immunogenic in the mammal being immunized. 
The immunization protocol may be determined by one skilled in the art based on standard 
protocols or routine experimentation. 

Alternatively, the anti-BGL6 antibodies may be monoclonal antibodies. Monoclonal 
antibodies may be produced by cells immunized in an animal or using recombinant DNA 
methods. (See, e.g., Kohler et al., 1975; U.S. Patent No. 4,816,567). 

An anti-BGL6 antibody of the invention may further comprise a humanized or human 
antibody. The term "humanized antibody" refers to humanized forms of non-human (e.g., 
murine) antibodies that are chimeric antibodies, immunoglobulin chains or fragments thereof 
(such as Fv, Fab, Fab', F(ab')2 or other antigen-binding partial sequences of antibodies) which 
contain some portion of the sequence derived from non-human antibody. Methods for 
humanizing non-human antibodies are well known in the art, as further detailed in Jones et al., 
1986; Riechmann et al., 1988; and Verhoeyen et al., 1988. Methods for producing human 
antibodies are also known in the art. See, e.g., Jakobovits, A., et al., 1995 and Jakobovits, A., 
1995. 

VI. Expression Of Recombinant BGL6 

The methods of the invention rely on the use of cells to express BGL6, with no 
particular method of BGL6 expression required. 

The invention provides host cells which have been transduced, transformed or 
transfected with an expression vector comprising a BGL6-encoding nucleic acid sequence. 
The culture conditions, such as temperature, pH and the like, are those previously used for 
the parental host cell prior to transduction, transformation or transfection and will be 
apparent to those skilled in the art. 

In one approach, a filamentous fungal cell or yeast cell is transfected with an 
expression vector having a promoter or biologically active promoter fragment or one or more 
(e.g., a series) of enhancers which functions in the host cell line, operably linked to a DNA 
segment encoding BGL6, such that BGL6 is expressed in the cell line. 

A. Nucleic Acid Constructs/Expression Vectors. 

Natural or synthetic polynucleotide fragments encoding BGL6 ("BGL6-encoding 
nucleic acid sequences") may be incorporated into heterologous nucleic acid constructs or 
vectors, capable of introduction into, and replication in, a filamentous fungal or yeast cell. 
The vectors and methods disclosed herein are suitable for use in host cells for the 
expression of BGL6. Any vector may be used as long as it is replicable and viable in the 
cells into which it is introduced. Large numbers of suitable vectors and promoters are 
known to those of skill in the art, and are commercially available. Cloning and expression 
vectors are also described in Sambrook et al., 1989, Ausubel FM et al., 1989, and Strathern 
et al., 1981, each of which is expressly incorporated by reference herein. Appropriate 
expression vectors for fungi are described in van den Hondel, C.A.M.J.J. et al. (1991) In: 
Bennett, J.W. and Lasure, L.L. (eds.) More Gene Manipulations in Fungi. Academic Press, 
pp. 396-428. The appropriate DNA sequence may be inserted into a plasmid or vector 
(collectively referred to herein as "vectors") by a variety of procedures. In general, the DNA 
sequence is inserted into an appropriate restriction endonuclease site(s) by standard 
procedures. Such procedures and related sub-cloning procedures are deemed to be within 
the scope of knowledge of those skilled in the art. 

Recombinant filamentous fungi comprising the coding sequence for BGL6 may be 
produced by introducing a heterologous nucleic acid construct comprising the BGL6 coding 
sequence into the cells of a selected strain of the filamentous fungi. 

Once the desired form of a bgl6 nucleic acid sequence, homologue, variant or 
fragment thereof, is obtained, it may be modified in a variety of ways. Where the sequence 
involves non-coding flanking regions, the flanking regions may be subjected to resection, 
mutagenesis, etc. Thus, transitions, transversions, deletions, and insertions may be 
performed on the naturally occurring sequence. 

A selected bgl6 coding sequence may be inserted into a suitable vector according to 
well-known recombinant techniques and used to transform filamentous fungi capable of 
BGL6 expression. Due to the inherent degeneracy of the genetic code, other nucleic acid 
sequences which encode substantially the same or a functionally equivalent amino acid 
sequence may be used to clone and express BGL6. Therefore it is appreciated that such 
substitutions in the coding region fall within the sequence variants covered by the present 
invention. Any and all of these sequence variants can be utilized in the same way as 
described herein for a parent BGL6-encoding nucleic acid sequence. 

The present invention also includes recombinant nucleic acid constructs comprising 
one or more of the BGL6-encoding nucleic acid sequences as described above. The 
constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the 
invention has been inserted, in a forward or reverse orientation. 

Heterologous nucleic acid constructs may include the coding sequence for bgl6, or a 
variant, fragment or splice variant thereof: (i) in isolation; (ii) in combination with additional 
coding sequences, such as fusion protein or signal peptide coding sequences, where the 
bgl6 coding sequence is the dominant coding sequence; (iii) in combination with non-coding 
sequences, such as introns and control elements, such as promoter and terminator 
elements or 5' and/or 3' untranslated regions, effective for expression of the coding 
sequence in a suitable host; and/or (iv) in a vector or host environment in which the bgl6 
coding sequence is a heterologous gene. 

In one aspect of the present invention, a heterologous nucleic acid construct is 
employed to transfer a BGL6-encoding nucleic acid sequence into a cell in vitro, with 
established filamentous fungal and yeast lines preferred. For long-term, high-yield 
production of BGL6, stable expression is preferred. It follows that any method effective to 
generate stable transformants may be used in practicing the invention. 

Appropriate vectors are typically equipped with a selectable marker-encoding nucleic 
acid sequence, insertion sites, and suitable control elements, such as promoter and 
termination sequences. The vector may comprise regulatory sequences, including, for 
example, non-coding sequences, such as introns and control elements, i.e., promoter and 
terminator elements or 5' and/or 3' untranslated regions, effective for expression of the 
coding sequence in host cells (and/or in a vector or host cell environment in which a 
modified soluble protein antigen coding sequence is not normally expressed), operably 
linked to the coding sequence. Large numbers of suitable vectors and promoters are known 
to those of skill in the art, many of which are commercially available and/or are described in 
Sambrook et al. (supra). 

Exemplary promoters include both constitutive promoters and inducible promoters, 
examples of which include a CMV promoter, an SV40 early promoter, an RSV promoter, an 
EF-1a promoter, a promoter containing the tet responsive element (TRE) in the tet-on or 
tet-off system as described (ClonTech and BASF), the beta actin promoter and the 
metallothionein promoter that can be upregulated by addition of certain metal salts. A 
promoter sequence is a DNA sequence which is recognized by the particular filamentous 
fungus for expression purposes. It is operably linked to a DNA sequence encoding a BGL6 
polypeptide. Such linkage comprises positioning of the promoter with respect to the 
initiation codon of the DNA sequence encoding the BGL6 polypeptide in the disclosed 
expression vectors. The promoter sequence contains transcription and translation control 
sequences which mediate the expression of the BGL6 polypeptide. Examples include the 
promoters from the Aspergillus niger, A. awamori or A. oryzae glucoamylase, alpha-amylase, 
or alpha-glucosidase encoding genes; the A. nidulans gpdA or trpC genes; the Neurospora 
crassa cbh1 or trp1 genes; the A. niger or Rhizomucor miehei aspartic proteinase encoding 
genes; the T. reesei cbh1, cbh2, egl1, egl2, or other cellulase encoding genes. 

The choice of the proper selectable marker will depend on the host cell, and 
appropriate markers for different hosts are well known in the art. Typical selectable marker 
genes include argB from A. nidulans or T. reesei, amdS from A. nidulans, pyr4 from 
Neurospora crassa or T. reesei, pyrG from Aspergillus niger or A. nidulans. Additional 
exemplary selectable markers include, but are not limited to, trpC, trp1, oliC31, niaD or leu2, 
which are included in heterologous nucleic acid constructs used to transform a mutant strain 
such as trp-, pyr-, leu- and the like. 

Such selectable markers confer to transformants the ability to utilize a metabolite that 
is usually not metabolized by the filamentous fungi. For example, the amdS gene from T. 
reesei encodes the enzyme acetamidase, which allows transformant cells to grow on 
acetamide as a nitrogen source. The selectable marker (e.g., pyrG) may restore the ability of 
an auxotrophic mutant strain to grow on a selective minimal medium, or the selectable 
marker (e.g., oliC31) may confer to transformants the ability to grow in the presence of an 
inhibitory drug or antibiotic. 

The selectable marker coding sequence is cloned into any suitable plasmid using 
methods generally employed in the art. Exemplary plasmids include pUC18, pBR322, and 
pUC100. 

The practice of the present invention will employ, unless otherwise indicated, 
conventional techniques of molecular biology, microbiology, recombinant DNA, and 
immunology, which are within the skill of the art. Such techniques are explained fully in the 
literature. See, for example, Sambrook et al., 1989; Freshney, 1987; Ausubel et al., 1993; 
and Coligan et al., 1991. All patents, patent applications, articles and publications 
mentioned herein are hereby expressly incorporated herein by reference. 

B. Host Cells and Culture Conditions For Enhanced BGL6 Production 
(i) Filamentous Fungi 

Thus, the present invention provides filamentous fungi comprising cells which have 
been modified, selected and cultured in a manner effective to result in enhanced BGL6 
production or expression relative to the corresponding non-transformed parental fungi. 

Examples of species of parental filamentous fungi that may be treated and/or 
modified for enhanced BGL6 expression include, but are not limited to, Trichoderma, e.g., 
Trichoderma reesei, Trichoderma longibrachiatum, Trichoderma viride, Trichoderma 
koningii; Penicillium sp., Humicola sp., including Humicola insolens; Aspergillus sp., 
Chrysosporium sp., Fusarium sp., Hypocrea sp., and Emericella sp. 

BGL6 expressing cells are cultured under conditions typically employed to culture the 
parental fungal line. Generally, cells are cultured in a standard medium containing 
physiological salts and nutrients, such as described in Pourquie, J. et al., Biochemistry and 
Genetics of Cellulose Degradation, eds. Aubert, J. P. et al., Academic Press, pp. 71-86, 
1988 and Ilmen, M. et al., Appl. Environ. Microbiol. 63:1298-1306, 1997. Culture conditions 
are also standard, e.g., cultures are incubated at 28°C in shaker cultures or fermenters until 
desired levels of BGL6 expression are achieved. 

Preferred culture conditions for a given filamentous fungus may be found in the 
scientific literature and/or from the source of the fungi such as the American Type Culture 
Collection (ATCC; "http://www.atcc.org/"). After fungal growth has been established, the 
cells are exposed to conditions effective to cause or permit the overexpression of BGL6. 

In cases where a BGL6 coding sequence is under the control of an inducible 
promoter, the inducing agent, e.g., a sugar, metal salt or antibiotic, is added to the medium 
at a concentration effective to induce high-level BGL6 expression. 

(ii) Yeast 

The present invention also contemplates the use of yeast as a host cell for BGL6 
production. Several other genes encoding hydrolytic enzymes have been expressed in 
various strains of the yeast S. cerevisiae. These include sequences encoding two 
endoglucanases (Penttila et al., 1987), two cellobiohydrolases (Penttila et al., 1988) and one 
beta-glucosidase from Trichoderma reesei (Cummings and Fowler, 1996), a xylanase from 
Aureobasidium pullulans (Li and Ljungdahl, 1996), an alpha-amylase from wheat (Rothstein 
et al., 1987), etc. In addition, a cellulase gene cassette encoding the Butyrivibrio 
fibrisolvens endo-beta-1,4-glucanase (END1), Phanerochaete chrysosporium 
cellobiohydrolase (CBH1), the Ruminococcus flavefaciens cellodextrinase (CEL1) and the 
Endomyces fibuliger cellobiase (Bgl1) was successfully expressed in a laboratory strain of S. 
cerevisiae (Van Rensburg et al., 1998). 

C. Introduction of a BGL6-Encoding Nucleic Acid Sequence into Host Cells. 

The invention further provides cells and cell compositions which have been 
genetically modified to comprise an exogenously provided BGL6-encoding nucleic acid 
sequence. A parental cell or cell line may be genetically modified (i.e., transduced, 
transformed or transfected) with a cloning vector or an expression vector. The vector may 
be, for example, in the form of a plasmid, a viral particle, a phage, etc., as further described 
above. 

Various methods may be employed for delivering an expression vector into cells in 
vitro. After a suitable vector is constructed, it is used to transform strains of fungi or yeast. 
General methods of introducing nucleic acids into cells for expression of heterologous 
nucleic acid sequences are known to the ordinarily skilled artisan. Such methods include, 
but are not limited to, electroporation; nuclear microinjection or direct microinjection into single 
cells; bacterial protoplast fusion with intact cells; use of polycations, e.g., polybrene or 
polyornithine; membrane fusion with liposomes, lipofectamine or lipofection-mediated 
transfection; high velocity bombardment with DNA-coated microprojectiles; incubation with 
calcium phosphate-DNA precipitate; DEAE-Dextran mediated transfection; infection with 
modified viral nucleic acids; and the like. 

Preferred methods for introducing a heterologous nucleic acid construct (expression 
vector) into filamentous fungi (e.g., T. reesei) include, but are not limited to, the use of a 
particle or gene gun, permeabilization of filamentous fungi cell walls prior to the 
transformation process (e.g., by use of high concentrations of alkali, e.g., 0.05 M to 0.4 M 
CaCl2 or lithium acetate), protoplast fusion or Agrobacterium-mediated transformation. An 
exemplary method for transformation of filamentous fungi by treatment of protoplasts or 
spheroplasts with polyethylene glycol and CaCl2 is described in Campbell, E.I. et al., Curr. 
Genet. 16:53-56, 1989 and Penttila, M. et al., Gene, 63:11-22, 1988. 

In addition, heterologous nucleic acid constructs comprising a BGL6-encoding 
nucleic acid sequence can be transcribed in vitro, and the resulting RNA introduced into the 
host cell by well-known methods, e.g., by injection. 

Following introduction of a heterologous nucleic acid construct comprising the coding 
sequence for bgl6, the genetically modified cells can be cultured in conventional nutrient 
media modified as appropriate for activating promoters, selecting transformants or 
amplifying expression of a BGL6-encoding nucleic acid sequence. The culture conditions, 
such as temperature, pH and the like, are those previously used for the host cell selected for 
expression, and will be apparent to those skilled in the art. 

The progeny of cells into which such heterologous nucleic acid constructs have been 
introduced are generally considered to comprise the BGL6-encoding nucleic acid sequence 
found in the heterologous nucleic acid construct. 

The invention further includes novel and useful transformants of filamentous fungi 
such as Trichoderma reesei for use in producing fungal cellulase compositions. The 
invention includes transformants of filamentous fungi, especially fungi comprising the bgl6 
coding sequence, comprising a modified form of the bgl6 coding sequence or deletion of the 
bgl6 coding sequence. 

Stable transformants of filamentous fungi can generally be distinguished from 
unstable transformants by their faster growth rate and the formation of circular colonies with 
a smooth rather than ragged outline on solid culture medium. Additionally, in some cases, a 
further test of stability can be made by growing the transformants on solid non-selective 
medium, harvesting the spores from this culture medium and determining the percentage of 
these spores which will subsequently germinate and grow on selective medium. 
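
The spore test of stability described above reduces to a simple percentage. The following is a minimal sketch in Python, assuming that a known number of spores harvested from non-selective medium is plated on selective medium and the resulting colonies are counted. The function name and the counting procedure are illustrative assumptions and are not part of the disclosure.

```python
def percent_growing_on_selective(spores_plated: int, colonies_on_selective: int) -> float:
    """Percentage of plated spores that germinate and grow on selective medium.

    Both arguments are hypothetical plate counts; a higher percentage
    suggests a more stable transformant.
    """
    if spores_plated <= 0:
        raise ValueError("spores_plated must be a positive count")
    if not 0 <= colonies_on_selective <= spores_plated:
        raise ValueError("colonies_on_selective must lie between 0 and spores_plated")
    return 100.0 * colonies_on_selective / spores_plated


if __name__ == "__main__":
    # Example with made-up counts: 150 colonies arising from 200 plated spores.
    print(percent_growing_on_selective(200, 150))
```

No threshold for "stable" is stated in the text, so the sketch reports the percentage only and leaves the interpretation to the practitioner.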

VII. Analysis For BGL6 Nucleic Acid Coding Sequences and/or Protein Expression. 

In order to evaluate the expression of BGL6 by a cell line that has been transformed 
with a BGL6-encoding nucleic acid construct, assays can be carried out at the protein level, 
the RNA level or by use of functional bioassays particular to glucosidase activity and/or 
production. 

In one exemplary application of the bgl6 nucleic acid and protein sequences 
described herein, a genetically modified strain of filamentous fungi, e.g., Trichoderma 
reesei, is engineered to produce an increased amount of BGL6. Such genetically modified 
filamentous fungi would be useful to produce a cellulase product with increased 
cellulolytic capacity. In one approach, this is accomplished by introducing the coding 
sequence for bgl6 into a suitable host, e.g., a filamentous fungus such as Trichoderma reesei. 

Accordingly, the invention includes methods for expressing BGL6 in a filamentous 
fungus or other suitable host by introducing an expression vector containing the DNA 
sequence encoding BGL6 into cells of the filamentous fungus or other suitable host. 

In another aspect, the invention includes methods for modifying the expression of 
BGL6 in a filamentous fungus or other suitable host. Such modification includes a decrease 
or elimination in expression, or expression of an altered form of BGL6. An altered form of 
BGL6 may have an altered amino acid sequence or an altered nucleic acid sequence. 

In general, assays employed to analyze the expression of BGL6 include Northern 
blotting, dot blotting (DNA or RNA analysis), RT-PCR (reverse transcriptase polymerase 
chain reaction), or in situ hybridization, using an appropriately labeled probe (based on the 
nucleic acid coding sequence) and conventional Southern blotting and autoradiography. 

In addition, the production and/or expression of BGL6 may be measured in a sample 
directly, for example, by assays for glucosidase activity, expression and/or production. Such 
assays are described, for example, in Chen et al. (1992), Herr et al. (1978), and U.S. Patent 
No. 6,184,018 (Li et al., 2001), each of which is expressly incorporated by reference herein. 
The ability of BGL6 to hydrolyze isolated soluble and insoluble substrates can be measured 
using assays described in Suurnakki et al. (2000) and Ortega et al. (2001). Substrates 
useful for assaying cellobiohydrolase, endoglucanase or beta-glucosidase activities include 
crystalline cellulose, filter paper, phosphoric acid swollen cellulose, hydroxyethyl cellulose, 
carboxymethyl cellulose, cellooligosaccharides, methylumbelliferyl lactoside, 
methylumbelliferyl cellobioside, orthonitrophenyl lactoside, paranitrophenyl lactoside, 
orthonitrophenyl cellobioside, paranitrophenyl cellobioside, orthonitrophenyl glucoside, 
paranitrophenyl glucoside, and methylumbelliferyl glycoside. The latter three are particularly 
useful in assaying beta-glucosidases. Beta-glucosidase assays are well-known in the art. See 
Cummings and Fowler (1996). 

In addition, protein expression may be evaluated by immunological methods, such 
as immunohistochemical staining of cells, tissue sections or immunoassay of tissue culture 
medium, e.g., by Western blot or ELISA. Such immunoassays can be used to qualitatively 
and quantitatively evaluate expression of BGL6. The details of such methods are known to 
those of skill in the art and many reagents for practicing such methods are commercially 
available. 

A purified form of BGL6 may be used to produce either monoclonal or polyclonal 
antibodies specific to the expressed protein for use in various immunoassays. (See, e.g., 
Hu et al., 1991). Exemplary assays include ELISA, competitive immunoassays, 
radioimmunoassays, Western blot, indirect immunofluorescent assays and the like. In 
general, commercially available antibodies and/or kits may be used for the quantitative 
immunoassay of the expression level of glucosidase proteins. 

VIII. Isolation And Purification Of Recombinant BGL6 Protein. 

In general, a BGL6 protein produced in cell culture is secreted into the medium and 
may be purified or isolated, e.g., by removing unwanted components from the cell culture 
medium. However, in some cases, a BGL6 protein may be produced in a cellular form 
necessitating recovery from a cell lysate. In such cases the BGL6 protein is purified from 
the cells in which it was produced using techniques routinely employed by those of skill in 
the art. Examples include, but are not limited to, affinity chromatography (Tilbeurgh et al., 
1984), ion-exchange chromatographic methods (Goyal et al., 1991; Fliess et al., 1983; 
Bhikhabhai et al., 1984; Ellouz et al., 1987), including ion-exchange using materials with 
high resolution power (Medve et al., 1998), hydrophobic interaction chromatography (Tomaz 
and Queiroz, 1999), and two-phase partitioning (Brumbauer et al., 1999). 

Typically, the BGL6 protein is fractionated to segregate proteins having selected 
properties, such as binding affinity to particular binding agents, e.g., antibodies or receptors; 
or which have a selected molecular weight range, or range of isoelectric points. 

Once expression of a given BGL6 protein is achieved, the BGL6 protein thereby 
produced is purified from the cells or cell culture. Exemplary procedures suitable for such 
purification include the following: antibody-affinity column chromatography; ion exchange 
chromatography; ethanol precipitation; reverse phase HPLC; chromatography on silica or on 
a cation-exchange resin such as DEAE; chromatofocusing; SDS-PAGE; ammonium sulfate 
precipitation; and gel filtration using, e.g., Sephadex G-75. Various methods of protein 
purification may be employed and such methods are known in the art and described, e.g., in 
Deutscher, 1990; Scopes, 1982. The purification step(s) selected will depend, e.g., on the 
nature of the production process used and the particular protein produced. 

IX. Utility of bgl6 and BGL6 

It can be appreciated that the bgl6 nucleotide, the BGL6 protein and compositions 
comprising BGL6 protein activity find utility in a wide variety of applications, some of which are 
described below. 

New and improved cellulase compositions that comprise varying amounts of CBH-type, 
EG-type and BG-type cellulases find utility in detergent compositions that exhibit enhanced 
cleaning ability, function as a softening agent and/or improve the feel of cotton fabrics (e.g., 
"stone washing" or "biopolishing"), in compositions for degrading wood pulp into sugars 
(e.g., for bio-ethanol production), and/or in feed compositions. The isolation and 
characterization of cellulase of each type provides the ability to control the aspects of such 
compositions. 

In one preferred approach, the cellulase of the invention finds utility in detergent 
compositions or in the treatment of fabrics to improve the feel and appearance. 

The inventive beta-glucosidases can be used in a variety of different applications. For 
example, the beta-glucosidase may be added to grapes during wine making to enhance the 
potential aroma of the finished wine product. Yet another application can be to use 
beta-glucosidase in fruit to enhance the aroma thereof. Alternatively, the isolated recombinant 
fermentation product containing enhanced beta-glucosidase can be used directly in food 
additives or wine processing to enhance the flavor or aroma. 

Since the rate of hydrolysis of cellulosic products may be increased by using a 
transformant having at least one additional copy of the bgl6 gene inserted into the genome, 
products that contain cellulose or heteroglycans can be degraded at a faster rate and to a 
greater extent. Products made from cellulose such as paper, cotton, cellulosic diapers and 
the like can be degraded more efficiently in a landfill. Thus, the fermentation product 
obtainable from the transformants or the transformants alone may be used in compositions 
to help degrade by liquefaction a variety of cellulose products that add to the overcrowded 
landfills. 

Separate saccharification and fermentation is a process whereby cellulose present in 
biomass, e.g., corn stover, is converted to glucose and subsequently yeast strains convert 
glucose into ethanol. Simultaneous saccharification and fermentation is a process whereby 
cellulose present in biomass, e.g., corn stover, is converted to glucose and, at the same 
time and in the same reactor, yeast strains convert glucose into ethanol. Thus, in another 
preferred approach, the glucosidase type cellulase of the invention finds utility in the 
degradation of biomass to ethanol. Ethanol production from readily available sources of 
cellulose provides a stable, renewable fuel source. 

Cellulose-based feedstocks are comprised of agricultural wastes, grasses and 
woods and other low-value biomass such as municipal waste (e.g., recycled paper, yard 
clippings, etc.). Ethanol may be produced from the fermentation of any of these cellulosic 
feedstocks. However, the cellulose must first be converted to sugars before there can be 
conversion to ethanol. 

A large variety of feedstocks may be used with the inventive beta-glucosidase and the 
one selected for use may depend on the region where the conversion is being done. For 
example, in the Midwestern United States agricultural wastes such as wheat straw, corn 
stover and bagasse may predominate while in California rice straw may predominate. 
However, it should be understood that any available cellulosic biomass may be used in any 
region. 

A cellulase composition containing an enhanced amount of beta-glucosidase finds utility 
in ethanol production. Ethanol from this process can be further used as an octane enhancer 
or directly as a fuel in lieu of gasoline which is advantageous because ethanol as a fuel 
source is more environmentally friendly than petroleum derived products. It is known that 
the use of ethanol will improve air quality and possibly reduce local ozone levels and smog. 
Moreover, utilization of ethanol in lieu of gasoline can be of strategic importance in buffering 
the impact of sudden shifts in non-renewable energy and petro-chemical supplies. 

Ethanol can be produced via saccharification and fermentation processes from 
cellulosic biomass such as trees, herbaceous plants, municipal solid waste and agricultural 
and forestry residues. However, one major problem encountered in this process is the lack 
of beta-glucosidase in the system to convert cellobiose to glucose. It is known that cellobiose 
acts as an inhibitor of cellobiohydrolases and endoglucanases and thereby reduces the rate 
of hydrolysis for the entire cellulase system. Therefore, the use of increased beta-glucosidase 
activity to quickly convert cellobiose into glucose would greatly enhance the production of 
ethanol. 

Thus, the inventive beta-glucosidase finds use in the hydrolysis of cellulose to its sugar 
components. In one embodiment, the beta-glucosidase is added to the biomass prior to the 
addition of a fermentative organism. In a second embodiment, the beta-glucosidase is added 
to the biomass at the same time as a fermentative organism. Optionally, there may be other 
cellulase components present in either embodiment. 

In another embodiment the cellulosic feedstock may be pretreated. Pretreatment 
may be by elevated temperature and the addition of either dilute acid, concentrated acid 
or dilute alkali solution. The pretreatment solution is added for a time sufficient to at least 
partially hydrolyze the hemicellulose components and then neutralized. 

In an alternative approach, a cellulase composition which is deficient in or free of 
beta-glucosidase is preferred. The deletion of the beta-glucosidase gene of this invention would be 
particularly useful in preparing cellulase compositions for use in detergents. Additionally, 
such compositions are useful for the production of cellobiose and other 
cellooligosaccharides. The deletion of the bgl6 gene from T. reesei strains would be 
particularly useful in preparing cellulase compositions for use in detergents and in 
isolating cellobiose. Cellulase enzymes have been used in a variety of detergent 
compositions to enzymatically clean clothes. However, it is known in this art that use of 
cellulase enzymes can impart degradation of the cellulose fibers in clothes. One possibility 
to decrease the degradation effect is to produce a detergent that does not contain 
beta-glucosidase. Thus, the deletion of this protein would affect the cellulase system to inhibit 
the other components via accumulation of cellobiose. The modified microorganisms of this 
invention are particularly suitable for preparing such compositions because the bgl6 gene 
can be deleted leaving the remaining CBH and EG components resulting in improved 
cleaning and softening benefits in the composition without degradative effects. 

The detergent compositions of this invention may employ, besides the cellulase 
composition (irrespective of the beta-glucosidase content, i.e., beta-glucosidase-free, substantially 
beta-glucosidase-free, or beta-glucosidase enhanced), a surfactant, including anionic, non-ionic 
and ampholytic surfactants, a hydrolase, building agents, bleaching agents, bluing agents 
and fluorescent dyes, caking inhibitors, solubilizers, cationic surfactants and the like. All of 
these components are known in the detergent art. The cellulase composition as described 
above can be added to the detergent composition either in a liquid diluent, in granules, in 
emulsions, in gels, in pastes, and the like. Such forms are well known to the skilled artisan. 
When a solid detergent composition is employed, the cellulase composition is preferably 
formulated as granules. Preferably, the granules can be formulated so as to contain a 
cellulase protecting agent. For a more thorough discussion, see US Patent Number 
6,162,782 entitled "Detergent compositions containing cellulase compositions deficient in 
CBH I type components," which is incorporated herein by reference. 

In yet another embodiment, the detergent compositions can also contain enhanced 
levels of beta-glucosidase or altered beta-glucosidase. In this regard, the choice depends upon 
the type of product one desires to use in detergent compositions to give the appropriate 
effects. 

Preferably the cellulase compositions are employed from about 0.00005 weight 
percent to about 5 weight percent relative to the total detergent composition. More 
preferably, the cellulase compositions are employed from about 0.0002 weight percent to 
about 2 weight percent relative to the total detergent composition. 
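
The weight-percent ranges above can be turned into mass bounds for a given batch. The following is a minimal sketch in Python; the function names and the labels returned are illustrative assumptions, and the word "about" in the stated ranges is ignored so that the bounds are treated as exact.

```python
# Weight-percent ranges from the text, relative to the total detergent composition.
BROAD_RANGE = (0.00005, 5.0)       # "preferably"
NARROW_RANGE = (0.0002, 2.0)       # "more preferably"


def cellulase_mass_bounds(total_detergent_g: float, wt_percent_range: tuple) -> tuple:
    """Return (minimum, maximum) cellulase mass in grams for a batch of detergent."""
    if total_detergent_g <= 0:
        raise ValueError("total_detergent_g must be positive")
    low, high = wt_percent_range
    return (total_detergent_g * low / 100.0, total_detergent_g * high / 100.0)


def classify_loading(cellulase_g: float, total_detergent_g: float) -> str:
    """Report which stated range, if any, a cellulase loading falls into."""
    if total_detergent_g <= 0:
        raise ValueError("total_detergent_g must be positive")
    wt_percent = 100.0 * cellulase_g / total_detergent_g
    if NARROW_RANGE[0] <= wt_percent <= NARROW_RANGE[1]:
        return "more preferred"
    if BROAD_RANGE[0] <= wt_percent <= BROAD_RANGE[1]:
        return "preferred"
    return "outside stated ranges"


if __name__ == "__main__":
    print(cellulase_mass_bounds(1000.0, NARROW_RANGE))
    print(classify_loading(3.0, 100.0))
```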

Deletion of the bgl6 gene would also provide accumulation of cellobiose in the 
cellulase system, which can be purified therefrom. In this regard, the present invention 
presents the possibility to isolate cellobiose from microorganisms in an easy and effective 
manner. 

Portions of the bgl6 nucleic acid sequence that are capable of binding to cellulose 
can be used to generate bacterial chimeric surface proteins, allowing whole-cell 
immobilization onto cellulose filters or other fibrous solid supports as described in Lehtio et 
al., 2001. 

In addition the bgl6 nucleic acid sequence finds utility in the identification and 
characterization of related nucleic acid sequences. A number of techniques useful for 
determining (predicting or confirming) the function of related genes or gene products 
include, but are not limited to, (A) DNA/RNA analysis, such as (1) overexpression, ectopic 
expression, and expression in other species; (2) gene knock-out (reverse genetics, targeted 
knock-out, viral induced gene silencing (VIGS, see Baulcombe, 1999)); (3) analysis of the 
methylation status of the gene, especially flanking regulatory regions; and (4) in situ 
hybridization; (B) gene product analysis such as (1) recombinant protein expression; (2) 
antisera production, (3) immunolocalization; (4) biochemical assays for catalytic or other 
activity; (5) phosphorylation status; and (6) interaction with other proteins via yeast two- 
hybrid analysis; (C) pathway analysis, such as placing a gene or gene product within a 
particular biochemical or signaling pathway based on its overexpression phenotype or by 
sequence homology with related genes; and (D) other analyses which may also be 
performed to determine or confirm the participation of the isolated gene and its product in a 
particular metabolic or signaling pathway, and help determine gene function. 

Endoglucanases and beta-glucosidases may be responsible for the production of 
disaccharides, such as sophorose, from cellooligosaccharides and glucose by 
transglycosylation reactions. Sophorose is known to be a very potent inducer of cellulase 
gene expression (Ilmen, M. et al., 1997, Appl. Environ. Microbiol. 63:1298-1306 and 
references therein). In this way EGs and BGLs may play an important role in the process of 
induction of cellulase gene expression. Over-expression of certain EGs or BGLs in a fungal 
strain may lead to higher overall cellulase productivity by that strain. 

A. Homology To Known Sequences 

The function of a related BGL6-encoding nucleic acid sequence may be 
determined by homology to known genes having a particular function. For example, 
a comparison of the coding sequence of an identified nucleic acid molecule to public nucleic 
acid sequence databases is used to confirm function by homology to known genes or by 
extension of the identified nucleic acid sequence. 

The term "% homology" is used interchangeably herein with the term "% identity" 
and refers to the level of nucleic acid or amino acid sequence identity between the 
nucleic acid sequence that encodes BGL6 or the BGL6 amino acid sequence, when aligned 
using a sequence alignment program. 

For example, as used herein, 80% homology means the same thing as 80% 
sequence identity determined by a defined algorithm, and accordingly a homologue of a given 
sequence has greater than 80% sequence identity over a length of the given sequence. 
Exemplary levels of sequence identity include, but are not limited to, 80, 85, 90, 95, 98% or 
more sequence identity to a given sequence, e.g., the coding sequence for bgl6, as described 
herein. 
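
The notion of "% identity" over an aligned pair can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming the two sequences have already been aligned by some program and padded with "-" for gaps. Counting every aligned column that is not a gap in both sequences as the denominator is an assumption made here; alignment programs differ in how they define the denominator, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where both sequences carry a gap are skipped; a column with a gap
    in only one sequence counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


def is_homologue(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when identity is greater than the threshold (80% by default, per the text)."""
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))            # 3 of 4 columns match
    print(is_homologue("ACGTACGTAC", "ACGTACGTAA"))    # 9 of 10 columns match
```

The threshold argument can be set to any of the exemplary levels listed above (80, 85, 90, 95 or 98).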

Exemplary computer programs which can be used to determine identity between two 
sequences include, but are not limited to, the suite of BLAST programs, e.g., BLASTN, 
BLASTX, and TBLASTX, BLASTP and TBLASTN, publicly available on the Internet at 
http://www.ncbi.nlm.nih.gov/BLAST/. See also, Altschul et al., 1990 and Altschul et al., 
1997. 

Sequence searches are typically carried out using the BLASTN program when 
evaluating a given nucleic acid sequence relative to nucleic acid sequences in the GenBank 
DNA Sequences and other public databases. The BLASTX program is preferred for 
searching nucleic acid sequences that have been translated in all reading frames against 
amino acid sequences in the GenBank Protein Sequences and other public databases. 
Both BLASTN and BLASTX are run using default parameters of an open gap penalty of 
11.0, and an extended gap penalty of 1.0, and utilize the BLOSUM-62 matrix. (See, e.g., 
Altschul et al., 1997.) 

A preferred alignment of selected sequences in order to determine "% identity" 
between two or more sequences is performed using, for example, the CLUSTAL-W program 
in MacVector version 6.5, operated with default parameters, including an open gap penalty 
of 10.0, an extended gap penalty of 0.1, and a BLOSUM 30 similarity matrix. 

In one exemplary approach, sequence extension of a nucleic acid encoding bgl6 may 
be carried out using conventional primer extension procedures as described in Sambrook et 
al., supra, to detect bgl6 precursors and processing intermediates of mRNA that may not 
have been reverse-transcribed into cDNA and/or to identify ORFs that encode a full length 
protein. 

In yet another aspect, the present invention includes the entire or partial nucleotide 
sequence of the nucleic acid sequence of bgl6 for use as a probe. Such a probe may be 
used to identify and clone out homologous nucleic acid sequences from related organisms. 

Screening of a cDNA or genomic library with the selected probe may be conducted 
using standard procedures, such as described in Sambrook et al. (1989). Hybridization 
conditions, including moderate stringency and high stringency, are provided in Sambrook et 
al., supra. 

The probes or portions thereof may also be employed in PCR techniques to generate 
a pool of sequences for identification of closely related bgl6 sequences. When bgl6 
sequences are intended for use as probes, a particular portion of a BGL6 encoding 
sequence, for example a highly conserved portion of the coding sequence may be used. 

For example, a bgl6 nucleotide sequence may be used as a hybridization probe for a 
cDNA library to isolate genes, for example, those encoding naturally-occurring variants of 
BGL6 from other fungal, bacterial or plant species, which have a desired level of sequence 
identity to the bgl6 nucleotide sequence disclosed in Figure 1 (SEQ ID NO:1). Exemplary 
probes have a length of about 20 to about 50 bases. 
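
Candidate probe sequences of the stated length can be enumerated from a nucleotide sequence by sliding a fixed window along it. The following is a minimal sketch in Python; the function name, the default window length and the step size are illustrative assumptions, and the "about 20 to about 50" range is enforced here as a strict 20 to 50 base limit for simplicity. The sketch does not assess conservation, melting temperature or specificity, which would guide the actual choice of probe.

```python
def candidate_probes(sequence: str, length: int = 30, step: int = 10) -> list:
    """Return fixed-length windows of a nucleotide sequence as candidate probes.

    length: probe length in bases (restricted to 20-50 in this sketch).
    step:   offset in bases between the starts of successive windows.
    """
    if not 20 <= length <= 50:
        raise ValueError("length must be between 20 and 50 bases")
    if step <= 0:
        raise ValueError("step must be positive")
    # Remove whitespace and normalize case so that wrapped input is accepted.
    seq = "".join(sequence.split()).upper()
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]


if __name__ == "__main__":
    demo = "ACGT" * 15  # placeholder 60-base sequence, not the bgl6 sequence
    for probe in candidate_probes(demo, length=30, step=10):
        print(probe)
```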

B. Two Hybrid Analysis 

Proteins identified by the present invention can be used in the yeast two-hybrid 
system to "capture" protein binding proteins which are putative signal pathway proteins. The 
yeast two hybrid system is described in Fields and Song, Nature 340:245-246 (1989). 
Briefly, in a two-hybrid system, a fusion of a DNA-binding domain-bgl6 (e.g., GAL4-bgl6 
fusion) is constructed and transfected into yeast cells. The whole bgl6 gene, or subregions 
of the bgl6 gene, may be used. A second construct containing the library of potential binding 
partners fused to the DNA activation domain is co-transfected. Yeast co-transformants 
harboring proteins that bind to the BGL6 protein are identified by, for example, 
beta-galactosidase or luciferase production (a screen), or survival on plates lacking an essential 
nutrient (a selection), as appropriate for the vectors used. 

C. Microarray Analysis 

In addition, microarray analysis, also known as expression profiling or transcript 
profiling, may be used to simultaneously evaluate the presence or expression of given DNA 
sequences, or changes in the expression of many different genes. In one approach, a large 
set of DNA sequences (probes), usually a broad set of expressed sequence tags, cDNAs, 
cDNA fragments, or sequence-specific oligonucleotides, is arrayed on a solid support such 
as a glass slide or nylon membrane. Labelled target for hybridization to the probes is 
generated by isolating mRNA from control and induced tissue, then labeling each mRNA 
pool either directly or via a cDNA or cRNA intermediate, with a distinct marker, usually a 
fluorescent dye. The microarray is hybridized with the complex probes, and the relative 
hybridization signal intensity associated with each location on the array can be quantitated 
for each marker dye. Differences in expression between the control and induced states can 
be measured as a ratio of the signal from the two marker dyes. (See Baldwin, D. et al., 
1999.) 
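
The ratio measurement described above can be sketched in Python as follows. This is a minimal illustration only: the gene names and intensity values are hypothetical, the use of a log2 scale is a common convention rather than something stated in the text, and the small `floor` value is an assumed guard against division by zero.

```python
import math


def log2_ratios(induced, control, floor=1.0):
    """Return log2(induced / control) for each array location.

    induced, control: dicts mapping a spot or gene name to the signal
    intensity measured for each marker dye. Intensities below `floor`
    are raised to `floor` (an assumed background guard).
    """
    ratios = {}
    for name, induced_signal in induced.items():
        control_signal = control[name]
        ratio = max(induced_signal, floor) / max(control_signal, floor)
        ratios[name] = math.log2(ratio)
    return ratios


if __name__ == "__main__":
    # Hypothetical dye intensities for three array locations.
    induced = {"spot_a": 800.0, "spot_b": 150.0, "spot_c": 50.0}
    control = {"spot_a": 200.0, "spot_b": 150.0, "spot_c": 400.0}
    for name, value in log2_ratios(induced, control).items():
        print(f"{name}: {value:+.2f}")
```

A positive value indicates a stronger signal in the induced state, a negative value a stronger signal in the control state.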

Microarray analysis of the source organism from which bgl6 was derived may be 
carried out to facilitate the understanding of gene function by identifying other genes that 
are coordinately regulated as a consequence of the overexpression of bgl6. The identity of 
coordinately regulated genes may help to place the bgl6 gene in a particular pathway. 
Alternatively, such analysis may be used to identify other genes involved in the same 
pathway. 
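
One simple way to look for coordinately regulated genes is to correlate expression profiles across conditions. The Python sketch below uses a Pearson correlation with an assumed threshold; the gene names other than bgl6, the profile values, and the threshold are all hypothetical, and correlation is offered only as one possible measure, not as the method of the disclosure.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either profile has no variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def coregulated(profiles, reference, threshold=0.9):
    """Return sorted names of genes whose profile correlates with the
    reference gene at or above `threshold` (an assumed cutoff)."""
    ref_profile = profiles[reference]
    return sorted(
        name
        for name, profile in profiles.items()
        if name != reference and pearson(ref_profile, profile) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical log-ratio profiles over four conditions.
    profiles = {
        "bgl6": [0.0, 1.0, 2.0, 3.0],
        "gene_a": [0.0, 2.0, 4.0, 6.0],
        "gene_b": [3.0, 2.0, 1.0, 0.0],
        "gene_c": [1.0, 1.0, 1.0, 1.0],
    }
    print(coregulated(profiles, "bgl6"))
```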

All publications, patents and patent applications are herein expressly incorporated by 
reference in their entirety. 

While the invention has been described with reference to specific methods and 
embodiments, it will be appreciated that various modifications and changes may be made 
without departing from the invention. 



EXAMPLE 1 

In one exemplary approach, a cDNA fragment for use as a probe is isolated by 
extracting total RNA from mycelia of a T. reesei strain grown under conditions known to 
induce cellulase production and obtaining the polyadenylated (polyA) fraction therefrom. 
The polyA RNA is used to produce a cDNA pool which is then amplified using specific 
primers based on the bgl6 nucleic acid sequence provided herein. 

Total RNA is isolated from the mycelia using methods known in the art, for example 
as described in Timberlake et al., 1981; Maniatis, et al., 1989; Ausubel, et al., 1993 and 
Sambrook et al., 1989, each of which is expressly incorporated by reference herein. Once 
the RNA is isolated, Northern blots are performed to confirm cellulase expression and select an optimal 
induction time for cellulase expression and corresponding RNA isolation. 

Messenger RNA (mRNA), having a poly(A) tail at the 3' end, may be purified from 
total RNA using methods known in the art. 

The T. reesei RNA is used as template for RT-PCR using methods known in the art 
(Loftus, J. et al., Science, 249:915-918, 1990). During this procedure the mRNA is reverse 
transcribed to produce first strand cDNA. The cDNA subsequently serves as template for 
PCR amplification of bgl6 cDNA sequences using specific oligonucleotide primers designed 
in accordance with SEQ ID No. 1 or SEQ ID No. 3. 
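
A minimal Python sketch of deriving a primer pair from a template follows. It assumes the simplest possible design, taking the first bases of the template as the forward primer and the reverse complement of the last bases as the reverse primer. The function names, the 20-base primer length, and the use of only the first 50 bases of the coding sequence of SEQ ID No. 3 as a stand-in template are illustrative assumptions; real primer design would also weigh melting temperature, secondary structure, and specificity.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(template, length=20):
    """Return (forward, reverse) primers flanking the whole template.

    forward: the first `length` bases of the template, read 5' to 3'.
    reverse: the reverse complement of the last `length` bases, so that
             it anneals to the sense strand and also reads 5' to 3'.
    """
    template = template.upper()
    if len(template) < 2 * length:
        raise ValueError("template is too short for two primers of this length")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


if __name__ == "__main__":
    # First 50 bases of the bgl6 coding sequence as listed in Table 1,
    # used here only as a short illustrative template.
    fragment = "ATGGGCGAATGGCAGGAGCAGATGATGGGTTTTGACGTGGAGGATGTTCT"
    forward, reverse = design_primers(fragment, length=20)
    print("forward:", forward)
    print("reverse:", reverse)
```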



Table 1. Sequences Provided In Support Of The Invention. 

Description (SEQ. ID No.) 

full length bgl6 DNA nucleic acid sequence (SEQ ID NO: 1) 

GATCACACCCCTCCCACCCTTCTcrrrTTCAAGGTTGTCCCCTTCTCCCACGG 
CTTTATGTACTTCCCACTCmTAATTCGCTCTTTCCATTCCAAGCCAAGCAA 


CAXCTGTGAGCAGCTCATCCTTCCCAATATrxnrrf^r.AATnr;rAr;r,Ar:r^ArsA'r 


gatgggttttgacgtggaggatgttctgtcnxjagctgagccaaaatgagaa 
gattgctctcitgtccggcattgattictggcatacitatccxdatac^ 
tacaacgtcccitcagtccgcctaacggacggtcctaacggcatacgaggc 
acaaagi i 1 1 1 i gctggcattcctgctgcctgcctgccatgtgggacggcc 
ctggcctctaccrgggataagcagctgctgaagaaggctgggaagctgct 
cggtgatgagtgcatcgcaaaaggcgcccactgctggctgggcccaacaa 
tcaatacrcccx^gatctcctctgggggggcgcggcitcgagtcatt^ 
aagatcajtacctgtccggcatcctrgctgcatctatqattctcggcmtg 
aaagcacaggtgtcatcrctgccgtcaaacactttotcgccaacgaccagg 
agcacgagcggcgagcggtcgactgtctcatcacccagcgggctctccgg 
gaggtctatctgcgacccttccagatcgtagcccgagatgcaaggcccggc 
gcattgatgacatcxttacaacaaggtcaatggcaagcacgtcgctgacag 
cgccgagttccttcagggcattctccgga<nx3agtggaattgggatcctcrr 
cattgtcagcgactggtacggcacctacaccactattgatcccatcaaagc 
cggccttgatctcgagatgccgggcgtttcacgatatcgcggcaaatacat 
cgagtctgctctgcaggcccgttrgctgaagcagtccactatcgatgagcg 
cgctcgccgcgtgctcaggttcgcccagaaggccagccatctcaaggtctc 
cgaggtagagcaaggccgtgacttcccagaggatcgcgtcctcaaccgtc 
agatctgcggcagcagcattgtcctactqaagaatgagaactccatcttac 
ctctccccaagtccgtcaagaaggtcgcccttgttggatcccacgtgcgtc 
taccggctatctcgggaggaggcagcgcctctcrrgtcccttactatgcca 
tatctctatacgatgccgtctctgaggtactagccggtgccacgatcacgc 
ACGAGGTCGGTGCCTATGCCCACCAAATGCTGCCCGTCATCGACGCAATGA 
TCAGCAACGCCGTAATCCACTTCTACAACGACCCCATCX3ATGTCAAAGACA 
GAAAGCrCCTTGGCAGTGAGAACGTATCGTCGACATCGTTCCAGCTCATGG 
ATTACAACAACATCCCAACGCTCAACAAGGCCATGTTCTGGGGTACTCTCG 
TGGGCGAGTTTATCCCTACCGCCACGGGAATTTGGGAATTTGGCCTCAGTG 
TCTTTGGCACTGCCGACCITTATATTGATAATGAGCTCGTGATrGAAAATA 
CAACACATCAGACGCXSTGGTACCGCCnTnTCGGAAAGGGAACGACGGAA 
AAAGTCGCTACCAGGAGGATGGTGGCCGGCAGCACCTACAAGCTGCGTCT 
CGAGTTTGGGTCTGCCAACACGACCAAGATGGAGACGACCGGTGTTGTCA 
ACTTTGGCGGCGGTGCCGTACACCTGGGTGCCTGTCTCAAGGTCGACCCAC 
AGGAGATGATTGCGCGGGCCGTCAAGGCCGCAGCCGATGCCGACTACACC 
ATCATCTGCACGGGACTCAGCXK3CGAGTGGGAGTCTGAGGGTTTTGACCG 
GCCTCACATGGACCTGCCCCXn'GGTGTGGACACCATGATCTCGCAAGTTCT 
TGACGCCGCTCCCAATGCTGTAGTCGTCAACCAGTCAGGCACCCCAGTGAC 
AATGAGCTGGGCTCATAAAGCAAAGGCCATTGTGCAGGCTTGGTATGGTG 
GTAACGAGACAGGCCACGGAATCTCCGATGTGCTCTTTGGCAACGTCAACC 
CGTCGGGGAAACTCTCCCTATCGTGGCCAGTCGATGTGAAGCACAACCCA 
GCATATCTCAACTACGCCAGCGTTGGTGGACGGGTCTTCTATGGCGAGGAT 
GTTTACGTTGGCTACAAGTTCTACGACAAAACGGAGAGGGAGGTTCTGT^ 

ccttttgggcatggcctgtcttacgctacctrcaagcrcccag 

tgaggacggtccccgaaaccttccacccggaccagcccacagtagccatt 

gtcaagatcaagaacacgagcagtgtcccgggcgcccaggtcctgcagct 

atacatttcggccccaaactcgcctacacatcgcccggtcaaggagctgca 

cggattcgaaaaggtgtatcttgaagctggcgaggagaaggaggtacaaa 

tacixattgacxjagtacgctacragcrtctgggacgagattgagagcatct 

ggaagagcgagaggggcatttatgatgtgcrtgtaggattctcgagtcag 

gaaatctcgggcaaggggaagctgattgtgcctgaaacgcgattctggat 

ggggctgtagattcaacacgtgagcaaaagcgattgcggaaagtaccaga 




AAAGCCAAGGGAGTCAAAGGATGGGAACTTGTGTCAATAGAAGATATGCA 


TAGATGGGCATTCTGGGATGGTQGTTTGGCATTAATGCAAAGAAGACAAA 


QAIGGATGTGATAAAAAAAAAAAAAAAAAAA 




BGL6 predicted amino acid sequence (SEQ ID NO: 2) 

MGEWQEQMMGFDVEDVI^QI^QNEBGLM.I^GroFWHTYPffKY>r\^S\^TO 
GPNGIRGTKFFAGIPAACIJPCGTALASTWDKQU.KKAGKLLGDECIAKGAHC 
WIXJPTINTPRSPLGGRGFESFSEDPYIJSGILAASMIIXjCESTGVISAVKHI^ 
QEHERRAVDCLITQRAIJREVYLRPFQIVARDARPGAmTSYNKVNGKHVADS 
AEFIXJGIUITEWNWDPIJVSDWGTYTTIDAIKAGIJDLEMPGVS 
ALQAM-lJKQSTroERARRVOlFAQKASHLKVSEVEQGRDFPEDRVLNRQICGS 
SIVLIJK>ffiNSIIPIJPKSVKKVALVGSHVRLPAISGGGSASLVPYYAISLYDAVSE 
VJLAOAl ITHEVGAYAHQMIJ VIDAMISNAVIHFYITOPIDVKDR^ 
TSFQIAlDYNNn>TIJNKAMFWGTLVGEFIPT^ 
ViJtNiiriQmGTAFFGKGTTEKVATRRMVAGSTYKIJa.EFGSANTT^^ 
VVOTGGGAVHLGACLKVDPQEMIARAVKAAADADYTnCTGLSGEWESEGFD 
RPHMDIPPGVDTMISQVIJDAAPNAVVVNQSGTPVTMSWAHKAKAIVQAWY 
GGl^GHGISDVLFGNVOTSGKI^LSWPVDVKHNPAYLNYASVGGRVLYGE 
DVYVGYKFYDKTEREVIJTFGHG15YATFKIJ>DSTVRTWETFHPDQFIVAIV 
KIKNTSSWGAOVIXDLYISAPNSPIHRPVKEIJIGFEKVYI^GEEKEVOIPIDO 
YATSFWDEffiSMWKSERGIYDVLVGFSSOEISGKGKLIWETR^ 




BGL6 predicted amino acid sequence with alternate start (SEQ ID NO: 4) 

MMGFDVEDVLSQI^QNEKIAIXSGroFWHT^ 
KFFAGIPAACTJ>CGTALASTWDKQLIJOU^GKLIX5DECI^ 
RSPLGGRGFESFSEDPYLSGIIJ\ASMILGCESTGVISAVKHFVANDQ 
DCLITQRALREVYLRPFQWARDARPGAmTSYlSnK: 
TEWNWDPLXVSDWYGTYTTTOAIKAGIJDLE]!^ 
QSTTOERARRVLRFAQKASHLKVSEVEQGRDFPEDRVIJ^QICGSSIV^ 
SIIJ>IJPKSVKKVALVGSHVRIJ*AISGGGSASLWYYAISLYDAVSEVLAGATr^ 
HEVGAYAHQMI^VroAMISNAVIHFYNDProVKDRKLI^ 
NNIPTLNKAMFWGTLVGEFIFrATGIWEFGLSWGTADLYro^ 
RGTAFFGKGTTEKVATRRMVAGSTYKLRLEFGSANTTKMET^ 
VHLGACLKVDPQEMIARAVKAAADADYTnCTGI^GEWESEGFDRPHI^ 
GVDXmSQVU^AAPNAVVWQSGTPVTMSWAHKAI^^ 
ISDVU^GNVNPSGKI^I^WPVDVKHNPAYIJs^ 
YDKTEREVIJTFGHG1^YATFKIJ>DSTVRTWETFOT 
GAQVIXJLYISAPNSPTHRPVKEUIGFEKVYIJEAGEEKEVQIPIDQYATSFW^ 
ESMWKSERGIYDVLVGFSSOEISGKGKLIVPETRFWMGL 


bgl6 nucleic acid coding sequence (SEQ ID NO: 3) 

ATGGGCGAATGGCAGGAGCAGATGATGGGTTTTGACGTGGAGGATGTTCT 
GTCTCAGCTGAGCCAAAATGAGAAGATTGCTCTCTTGTCCGGCATTGATTT 
CTGGCATACTTATCCCATACCAAAGTACAACGTCCCTTCAGTCCGCCTAAC 
GGACGGTCCTAACGGCATACGAGGCACAAAGTTTTTTGCTGGCATTCCTGC 
TGCCTGCCTGCCATGTGGGACGGCCCTGGCCTCTACCTGGGATAAGCAGCT 
GCTGAAGAAGGCTGGGAAGCTGCTCGGTGATGAGTGCATCGCAAAAGGCG 
CCCACTGCTGGCTGGGCCCAACAATCAATACTCCCCGATCTCCTCTGGGGG 
GGCGCGGCTTCGAGTCATTTTCGGAAGATCCGTACCTGTCCGGCATCCTTG 
CTGCATCTATGATTCTCGGCTGTGAAAGCACAGGTGTCATCTCTGCCGTCA 
AACACTTTGTCGCCAACGACCAGGAGCACGAGCGGCGAGCGGTCGACTGT 
CTCATCACCCAGCGGGCTCTCCGGGAGGTCTATCTGCGACCCTTCCAGATC 
GTAGCCCGAGATGCAAGGCCCGGCGCATTGATGACATCCTACAACAAGGT 
CAATGGCAAGCACGTCGCTGACAGCGCCGAGTTCCTTCAGGGCATTCTCCG 
GACTGAGTGGAATTGGGATCCTCTCATTGTCAGCGACTGGTACGGCACCTA 
CACCACTATTGATGCCATCAAAGCCGGCCTTGATCTCGAGATGCCGGGCGT 
TTCACGATATCGCGGCAAATACATCGAGTCTGCTCTGCAGGCCCGTTTGCT 
GAAGCAGTCCACTATCGATGAGCGCGCTCGCCGCGTGCTCAGGTTCGCCCA 
GAAGGCCAGCCATCTCAAGGTCTCCGAGGTAGAGCAAGGCCGTGACTTCC 
CAGAGGATCGCGTCCTCAACCGTCAGATCTGCGGCAGCAGCATTGTCCTAC 
TGAAGAATGAGAACTCCATCTTACCTCTCCCCAAGTCCGTCAAGAAGGTCG 
CCCTTGTTGGATCCCACGTGCGTCTACCGGCTATCTCGGGAGGAGGCAGCG 
CCTCTCTTGTCCCTTACTATGCCATATCTCTATACGATGCCGTCTCTGAGGT 
ACTAGCCGGTGCCACGATCACGCACGAGGTCGGTGCCTATGCCCACCAAA 
TGCTGCCCGTCATCGACGCAATGATCAGCAACGCCGTAATCCACTTCTACA 
ACGACCCCATCGATGTCAAAGACAGAAAGCTCCTTGGCAGTGAGAACGTA 
TCGTCGACATCGTTCCAGCTCATGGATTACAACAACATCCCAACGCTCAAC 
AAGGCCATGTTCTGGGGTACTCTCGTGGGCGAGTTTATCCCTACCGCCACG 
GGAATTTGGGAATTTGGCCTCAGTGTCTTTGGCACTGCCGACCTTTATATTG 
ATAATGAGCTCGTGATTGAAAATACAACACATCAGACGCGTGGTACCGCC 
TTTTTCGGAAAGGGAACGACGGAAAAAGTCGCTACCAGGAGGATGGTGGC 
CGGCAGCACCTACAAGCTGCGTCTCGAGTTTGGGTCTGCCAACACGACCAA 
GATGGAGACGACCGGTGTTGTCAACTTTGGCGGCGGTGCCGTACACCTGG 
GTGCCTGTCTCAAGGTCGACCCACAGGAGATGATTGCGCGGGCCGTCAAG 
GCCGCAGCCGATGCCGACTACACCATCATCTGCACGGGACTCAGCGGCGA 
GTGGGAGTCTGAGGGTTTTGACCGGCCTCACATGGACCTGCCCCCTGGTGT 
GGACACCATGATCTCGCAAGTTCTTGACGCCGCTCCCAATGCTGTAGTCGT 
CAACCAGTCAGGCACCCCAGTGACAATGAGCTGGGCTCATAAAGCAAAGG 
CCATTGTGCAGGCTTGGTATGGTGGTAACGAGACAGGCCACGGAATCTCC 
GATGTGCTCTTTGGCAACGTCAACCCGTCGGGGAAACTCTCCCTATCGTGG 
CCAGTCGATGTGAAGCACAACCCAGCATATCTCAACTACGCCAGCGTTGGT 
GGACGGGTCTTGTATGGCGAGGATGTTTACGTTGGCTACAAGTTCTACGAC 
AAAACGGAGAGGGAGGTTCTGTTTCCTTTTGGGCATGGCCTGTCTTACGCT 
ACCTTCAAGCTCCCAGATTCTACCGTGAGGACGGTCCCCGAAACCTTCCAC 

CCGGACCAGCCCACAGTAGCCATTGTCAAGATCAAGAACACGAGCAGTGT 

CCCGGGCGCCCAGGTCCTGCAGCTATACATTTCGGCCCCAAACTCGCCTAC 

ACATCGCCCGGTCAAGGAGCTGCACGGATTCGAAAAGGTGTATCTTGAAG 

CTGGCGAGGAGAAGGAGGTACAAATACCCATTGACCAGTACGCTACTAGC 

TTCTGGGACGAGATTGAGAGCATGTGGAAGAGCGAGAGGGGCATTTATGA 

TGTGCTTGTAGGATTCTCGAGTCAGGAAATCTCGGGCAAGGGGAAGCTGA 

TTGTGCCTGAAACGCGATTCTGGATGGGGCTGTAG 
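
The relationship between the coding sequence and the two predicted amino acid sequences in Table 1 can be illustrated with a short Python sketch that translates DNA using the standard genetic code and lists in-frame ATG codons. The function names are illustrative, the use of the standard code is an assumption, and the input is only the first 50 bases of the coding sequence as listed above, not the full sequence.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered with T, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 0; a trailing partial codon is ignored and
    any codon with an unrecognized base is reported as 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def in_frame_starts(dna):
    """Return nucleotide offsets of ATG codons in frame 0."""
    dna = dna.upper()
    return [i for i in range(0, len(dna) - 2, 3) if dna[i:i + 3] == "ATG"]


if __name__ == "__main__":
    fragment = "ATGGGCGAATGGCAGGAGCAGATGATGGGTTTTGACGTGGAGGATGTTCT"
    print(translate(fragment))        # MGEWQEQMMGFDVEDV
    print(in_frame_starts(fragment))  # [0, 21, 24]
```

The translation reproduces the opening residues of the predicted amino acid sequence, and the in-frame ATG at offset 21 corresponds to the first residue of the sequence listed with the alternate start.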



